PREFACE TO THE FIRST EDITION


Galois theory is a wonderful part of mathematics. Its historical roots date back to the 
solution of cubic and quartic equations in the sixteenth century. But besides helping 
us understand the roots of polynomials, Galois theory also gave birth to many of the 
central concepts of modern algebra, including groups and fields. In addition, there is 
the human drama of Evariste Galois, whose death at age 20 left us with the brilliant 
but not fully developed ideas that eventually led to Galois theory. 

Besides being great history, Galois theory is also great mathematics. This is due 
primarily to two factors: first, its surprising link between group theory and the roots 
of polynomials, and second, the elegance of its presentation. Galois theory is often 
described as one of the most beautiful parts of mathematics. 

This book was written in an attempt to do justice to both the history and the power 
of Galois theory. My goal is for students to appreciate the elegance of the theory and 
simultaneously have a strong sense of where it came from. 

The book is intended for undergraduates, so that many graduate-level topics are 
not covered. On the other hand, the book does discuss a broad range of topics, 
including symmetric polynomials, angle trisections via origami, Galois’s criterion 
for an irreducible polynomial of prime degree to be solvable by radicals, and Abel’s 
theorem about ruler-and-compass constructions on the lemniscate. 


A. Structure of the Text. The text is divided into chapters and sections. We use 
the following numbering conventions: 


• Theorems, lemmas, definitions, examples, etc., are numbered according to chapter
and section. For example, the third section of Chapter 7 is called Section 7.3. This
section begins with Theorem 7.3.1, Corollary 7.3.2, and Example 7.3.3.

• In contrast, equations are numbered according to the chapter. For example, (4.1)
means the first numbered equation of Chapter 4.


Sections are sometimes divided informally into subsections labeled A, B, C, etc. In
addition, many sections contain endnotes of two types:

• Mathematical Notes develop the ideas introduced in the section. Each idea is
announced with a small black square ▪.

• Historical Notes explain some of the history behind the concepts introduced in the
section.

The symbol ■ denotes the end of a proof or the absence of a proof, and <> denotes
the end of an example.

References in the text use one of two formats:

• References to the bibliography at the end of the book are given by the author’s
last name, as in [Abel]. When there is more than one item by a given author, we
add numbers, as in [Jordan1] and [Jordan2].

• Some more specialized references are listed at the end of the chapter in which
the reference occurs. These references are listed numerically, so that if you are
reading Chapter 10, then [1] means the first reference at the end of that chapter.


The text has numerous exercises, many more than can be assigned during an 
actual course. Some of the exercises can be used as exam questions. Hints to 
selected exercises can be found in Appendix B. 

The algebra needed for the book is covered in Appendix A. Students should read 
Sections A.1 and A.2 before starting Chapter 1. 


B. The Four Parts. The book is organized into four parts. Part I (Chapters 1 to 3) 
focuses on polynomials. Here, we study cubic polynomials, symmetric polynomials 
and prove the Fundamental Theorem of Algebra. In Part II (Chapters 4 to 7), the focus 
shifts to fields, where we develop their basic properties and prove the Fundamental 
Theorem of Galois Theory. Part III is concerned with the following applications of 
Galois theory: 


• Chapter 8 discusses solvability by radicals.

• Chapter 9 treats cyclotomic equations.

• Chapter 10 explores geometric constructions.

• Chapter 11 studies finite fields.

Finally, Part IV covers the following further topics:

• Chapter 12 discusses the work of Lagrange, Galois, and Kronecker.

• Chapter 13 explains how to compute Galois groups.

• Chapter 14 treats solvability by radicals for polynomials of prime power degree.

• Chapter 15 proves Abel’s theorem on the lemniscate.


C. Notes to the Instructor. Many books on Galois theory have been strongly
influenced by Artin’s thin but elegant presentation [Artin]. This book is different. In 
particular: 


• Symmetric polynomials and the Theorem of the Primitive Element are used to
prove some of the main results of Galois theory.

• The historical context of Galois theory is discussed in detail.


These choices reflect my personal preferences and my conviction that students need 
to know what an idea really means and where it came from before they can fully 
appreciate its elegance. The result is a book which is definitely not thin, though I 
hope that the elegance comes through. 

The core of the book consists of Parts I and II (Chapters 1 to 7). It should be 
possible to cover this material in about 9 weeks, assuming three lectures per week. 
In the remainder of the course, the instructor can pick and choose sections from Parts 
III and IV. These chapters can also be used for reading courses, student projects, or 
independent study. 

Here are some other comments for the instructor: 


• Sections labeled “Optional” can be skipped without loss of continuity. I sometimes
assign the optional section on Abelian equations (Section 6.5) as part of a take-home
exam.

• Students typically will have seen most but not all of the algebra in Appendix A.
My suggestion is to survey the class about what parts of Appendix A are new to
them. These topics can then be covered when needed in the text.

• For the most part, the Mathematical Notes and Historical Notes are not used in
the subsequent text, though I find that they stimulate some interesting classroom
discussions. The exception is Chapter 12, which draws on the Historical Notes of
earlier chapters.


D. Acknowledgments. The manuscript of this book was completed during a 
Mellon 8 sabbatical funded by the Mellon Foundation and Amherst College. I am 
very grateful for their support. I also want to express my indebtedness to the authors 
of the many fine presentations of Galois theory listed at the end of the book. 

I am especially grateful to Joseph Fineman, Walt Parry, Abe Shenitzer, and Jerry 
Shurman for their careful reading of the manuscript. I would also like to thank 
Kamran Divaani-Aazar, Harold Edwards, Alexander Hulpke, Teresa Krick, Barry 
Mazur, John McKay, Norton Starr, and Siman Wong for their help.

The students who took courses at Amherst College based on preliminary versions 
of the manuscript contributed many useful comments and suggestions. I thank them 
all and dedicate this book to students (of all ages) who undertake the study of this 
wonderful subject. 


David A. Cox
May 2004, Amherst, Massachusetts 


PREFACE TO THE SECOND EDITION 


For the second edition, the following changes have been made: 


• Numerous typographical errors were corrected.

• Some exercises were dropped and others were added, a net gain of six.

• Section 13.3 contains a new subsection on the Galois group of irreducible separable
quartics in all characteristics, based on ideas of Keith Conrad.

• The discussion of Maple in Section 2.3 was updated.

• Sixteen new references were added.

• The notation section was expanded to include all notation used in the text.

• Appendix C on student projects was added at the end of the book.


I would like to thank Keith Conrad for permission to include his treatment of quartics
in all characteristics in Section 13.3. Thanks also go to Alexander Hulpke for his help 
in updating the references to Chapter 14, and to Takeshi Kajiwara and Akira Iino for 
the improved proof of Lemma 14.4.5 and for the many typos they found in preparing 
the Japanese translation of the first edition. I also appreciate the suggestions made 
by the reviewers of the proposal for the second edition. 

I am extremely grateful to the many readers who sent me comments and typos 
they found in the first edition. There are too many to name here, but be assured that 
you have my thanks. Any errors in the second edition are my responsibility. 


Here is a chart that shows the relation between the 15 chapters and the 4 parts of 
the book: 


[Chart: the 15 chapters grouped into Parts I through IV]


Information about the book, including typo lists, can be found at 
http://www.cs.amherst.edu/~dac/galois.html
As always, comments and corrections are welcome. 


David A. Cox


December 2011, Amherst, Massachusetts 


NOTATION 


1 BASIC NOTATION 


Standard Rings and Fields. We use the following standard notation: 


$\mathbb{Z}$    ring of integers,
$\mathbb{Q}$    field of rational numbers,
$\mathbb{R}$    field of real numbers,
$\mathbb{C}$    field of complex numbers.


Sets. We use the usual notation for union $\cup$ and intersection $\cap$, and we define

$A \setminus B = \{x \in A \mid x \notin B\}$,
$|S|$ = the number of elements in a finite set $S$.

We write $A \subset B$ to indicate that $A$ is a subset of $B$. (Some texts write $A \subseteq B$ for an
arbitrary subset and reserve $A \subset B$ for the case when $A$ is strictly smaller than $B$. We
do not follow this practice.) Thus $A = B$ if and only if $A \subset B$ and $B \subset A$. Finally,
given sets $A$ and $B$, their Cartesian product is

$A \times B = \{(a,b) \mid a \in A,\ b \in B\}$.

Functions. A function $f : A \to B$ is sometimes denoted $x \mapsto f(x)$. Also, a
one-to-one onto map (a one-to-one correspondence) is often written

$f : A \simeq B$.

If $S$ is any set, then the identity map

$1_S : S \simeq S$

is defined by $s \mapsto s$ for $s \in S$. Also, given $f : A \to B$, we have:

$f|_{A_0} : A_0 \to B$    restriction of $f$ to $A_0 \subset A$,
$f(A_0) = \{f(a) \mid a \in A_0\}$    image of $A_0 \subset A$ under $f$,
$f^{-1}(B_0) = \{a \in A \mid f(a) \in B_0\}$    inverse image of $B_0 \subset B$ under $f$.


The Integers. For integers $a, b, n \in \mathbb{Z}$ with $n > 0$, we define:

$a \mid b$    $b$ is an integer multiple of $a$,
$a \nmid b$    $b$ is not an integer multiple of $a$,
$a \equiv b \bmod n$    $n \mid a - b$,
$\gcd(a,b)$    greatest common divisor of $a, b$,
$\mathrm{lcm}(a,b)$    least common multiple of $a, b$,
$\phi(n) = |(\mathbb{Z}/n\mathbb{Z})^*|$    Euler $\phi$-function.
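
The Complex Numbers. Properties of $\mathbb{C}$ are reviewed in Section A.2. Also:

$\mathrm{Re}(z)$, $\mathrm{Im}(z)$    real and imaginary parts of $z \in \mathbb{C}$,
$\bar{z}$, $|z|$    complex conjugate and absolute value of $z \in \mathbb{C}$,
$e^{i\theta} = \cos\theta + i\sin\theta$    Euler’s formula,
$z = |z|e^{i\theta}$    polar representation of $z \in \mathbb{C}$,
$\zeta_n = e^{2\pi i/n}$    primitive $n$th root of unity,
$S^1 = \{e^{i\theta} \mid \theta \in \mathbb{R}\}$    unit circle in $\mathbb{C} \simeq \mathbb{R}^2$.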

Groups. Basic properties of groups are reviewed in Section A.1. Also:

$o(g)$    order of an element $g \in G$,
$\langle S \rangle$    subgroup generated by $S \subset G$,
$gH$, $Hg$    left and right cosets of subgroup $H \subset G$,
$G/H$    quotient of group $G$ by normal subgroup $H$,
$S_n$    symmetric group on $n$ letters,
$A_n$    alternating group on $n$ letters,
$D_{2n}$    dihedral group of order $2n$,
$\mathrm{sgn}(\sigma)$    sign of $\sigma \in S_n$,
$\mathrm{Ker}(\varphi)$, $\mathrm{Im}(\varphi)$    kernel and image of group homomorphism $\varphi$.


Rings. Basic properties of rings are reviewed in Section A.1. Also:

$\mathrm{Ker}(\varphi)$, $\mathrm{Im}(\varphi)$    kernel and image of ring homomorphism $\varphi$,
$\langle r_1,\dots,r_n \rangle$    ideal generated by $r_1,\dots,r_n \in R$,
$R/I$    quotient of ring $R$ by ideal $I$,
$R^*$    group of units of a ring $R$.


2 CHAPTER-BY-CHAPTER NOTATION 




Here we list the notation introduced in each chapter of the text, followed by the page
number where the notation is defined. Many of these items appear in the index, which
lists other important pages where the notation is used.


Chapter 1 Notation.

$\omega = \zeta_3$    primitive cube root of unity    6
$D = q^2 + 4p^3/27$    quantity appearing in Cardan’s formula    11
$\Delta = -27D$    discriminant of $y^3 + py + q$    12

Chapter 2 Notation.

$F[x_1,\dots,x_n]$    polynomial ring in variables $x_1,\dots,x_n$ over $F$    26
$\deg(f)$    total degree of $f \in F[x_1,\dots,x_n] \setminus \{0\}$    26
$F(x_1,\dots,x_n)$    field of rational functions in $x_1,\dots,x_n$ over $F$    26
$\sigma_1,\dots,\sigma_n$    elementary symmetric polynomials    27
$\sum_n x_1^{a_1} \cdots x_n^{a_n}$    symmetric polynomial built from $x_1^{a_1} \cdots x_n^{a_n}$    33
$x^n - \sigma_1 x^{n-1} + \cdots + (-1)^n \sigma_n$    universal polynomial of degree $n$    37
$\Delta$, $\sqrt{\Delta}$    universal discriminant and its square root    46
$\Delta(f)$    discriminant of $f \in F[x]$    47

Chapter 3 Notation.

$F \subset L$    $L$ is an extension field of $F$    58

Chapter 4 Notation.

$\Phi_n(x)$    $n$th cyclotomic polynomial    75
$F[\alpha_1,\dots,\alpha_n]$    subring generated by $F$ and $\alpha_1,\dots,\alpha_n \in L$    75
$F(\alpha_1,\dots,\alpha_n)$    subfield generated by $F$ and $\alpha_1,\dots,\alpha_n \in L$    76
$\mathbb{F}_p = \mathbb{Z}/p\mathbb{Z}$    finite field with $p$ elements, $p$ prime    84
$[L:F]$    degree of field extension $F \subset L$    89
$\overline{\mathbb{Q}}$    field of algebraic numbers    96


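
Chapter 5 Notation.

$\mathrm{Res}(f,g,x)$    resultant of $f, g \in F[x]$    115

Chapter 6 Notation.

$\mathrm{Gal}(L/F)$    Galois group of extension $F \subset L$    125
$\mathrm{AGL}(1,\mathbb{F}_p)$    one-dimensional affine linear group    137
$F(\sigma_1,\dots,\sigma_n) \subset F(x_1,\dots,x_n)$    universal extension    138

Chapter 7 Notation.

$L_H$    fixed field of $H \subset \mathrm{Gal}(L/F)$    147
$\sigma K$    conjugate field of $F \subset K \subset L$ for $\sigma \in \mathrm{Gal}(L/F)$    154
$N_G(H)$    normalizer of subgroup $H \subset G$    159
$\mathrm{GL}(2,F)$    general linear group of $F^2$    178
$\mathrm{PGL}(2,F)$    projective linear group of $F^2$    179
$\hat{F}$, $\hat{\mathbb{C}}$    $F \cup \{\infty\}$, $\mathbb{C} \cup \{\infty\}$    180
$S^2$    unit sphere in $\mathbb{R}^3$    180
$\mathrm{Rot}(S^2)$    rotation group of $S^2$    181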


Chapter 8 Notation.

$K_1 K_2 \subset L$    compositum of $K_1, K_2 \subset L$    198
$\beta + \zeta^{-i}\sigma(\beta) + \zeta^{-2i}\sigma^2(\beta) + \cdots$    Lagrange resolvent    203

Chapter 9 Notation.

$\zeta_n = e^{2\pi i/n}$    primitive $n$th root of unity    229
$\Phi_n(x)$    $n$th cyclotomic polynomial    229
$ef = p - 1$    factorization of $p - 1$, $p$ prime    238
$H_f \subset (\mathbb{Z}/p\mathbb{Z})^*$    unique subgroup of order $f$    238
$L_f \subset \mathbb{Q}(\zeta_p)$    fixed field corresponding to $H_f$    238
$(f,\lambda)$    $f$-period, primitive element of $L_f$    240


Chapter 10 Notation.

$\mathscr{C}$    field of constructible numbers    257
$\mathscr{P}$    field of Pythagorean numbers    265
$F_m = 2^{2^m} + 1$    Fermat number    270
$\mathscr{O}$    field of origami numbers    277

Chapter 11 Notation.

$\mathbb{F}_q = \mathbb{F}_{p^n}$    finite field with $q = p^n$ elements, $p$ prime    293
$\mathrm{Frob}_p$    Frobenius automorphism of $\mathbb{F}_q$    294
$\mathrm{GL}(n,F)$    general linear group of $F^n$    296
$N_m$    number of monic degree $m$ irreducible $f \in \mathbb{F}_p[x]$    301
$\mu(n)$    Möbius function    302


Chapter 12 Notation.

$\theta(x)$    resolvent polynomial of $\varphi \in L = F(x_1,\dots,x_n)$    317
$H(\varphi) \subset S_n$    isotropy group of $\varphi \in L = F(x_1,\dots,x_n)$    318
$x_1 + \zeta^{i} x_2 + \zeta^{2i} x_3 + \cdots$    Lagrange resolvent    328
$s(y)$    Galois resolvent    335
$V = t_1\alpha_1 + \cdots + t_n\alpha_n$    primitive element used by Galois    336
$\mathfrak{R}', \mathfrak{R}'', \mathfrak{R}''', \dots$    algebraic quantities used by Kronecker    348


Chapter 13 Notation.

$\theta_f(y)$    Ferrari resolvent of quartic $f$    358
$\theta_f(y)$    sextic resolvent of quintic $f$    373
$\Theta_f(y)$    general resolvent of $f$    387


POLYNOMIALS


The first three chapters focus on polynomials and their roots. 

We begin in Chapter 1 with cubic polynomials. The goal is to derive Cardan’s 
formulas and to see how the permutations of the roots influence things. 

Then, in Chapter 2, we learn how to express the coefficients of a polynomial as 
certain symmetric polynomials in the roots. This leads to questions about describing 
all symmetric polynomials. We also discuss the discriminant. 

Finally, in Chapter 3, we show that all polynomials have roots in a possibly larger 
field. We also prove the Fundamental Theorem of Algebra, which asserts that the 
roots of a polynomial with complex coefficients are complex numbers. 


CHAPTER 1 


CUBIC EQUATIONS 


The quadratic formula states that the solutions of a quadratic equation

$ax^2 + bx + c = 0, \quad a, b, c \in \mathbb{C},\ a \neq 0$

are given by

(1.1)    $x = \dfrac{-b \pm \sqrt{b^2 - 4ac}}{2a}.$

In this chapter we will consider a cubic equation

$ax^3 + bx^2 + cx + d = 0, \quad a, b, c, d \in \mathbb{C},\ a \neq 0,$


and we will show that the solutions of this equation are given by a similar though
somewhat more complicated formula. Finding the formula will not be difficult, but
understanding where it comes from and what it means will lead to some interesting
questions.


Galois Theory, Second Edition. By David A. Cox 
Copyright © 2012 John Wiley & Sons, Inc. 




1.1 CARDAN’S FORMULAS 


Given a cubic equation $ax^3 + bx^2 + cx + d = 0$ with $a \neq 0$, we first divide by $a$ to
rewrite the equation as

$x^3 + bx^2 + cx + d = 0, \quad b, c, d \in \mathbb{C},$

where $b/a$, $c/a$, and $d/a$ have been replaced with $b$, $c$, and $d$, respectively. Observe
that $x^3 + bx^2 + cx + d$ is a monic polynomial and that reducing to the monic case has
no effect on the roots.

The next step is to remove the coefficient of $x^2$ by the substitution

$x = y - \dfrac{b}{3}.$


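The binomial theorem implies that

$x^2 = y^2 - 2y\left(\dfrac{b}{3}\right) + \left(\dfrac{b}{3}\right)^2 = y^2 - \dfrac{2b}{3}y + \dfrac{b^2}{9},$
$x^3 = y^3 - 3y^2\left(\dfrac{b}{3}\right) + 3y\left(\dfrac{b}{3}\right)^2 - \left(\dfrac{b}{3}\right)^3 = y^3 - by^2 + \dfrac{b^2}{3}y - \dfrac{b^3}{27},$

so that

$0 = x^3 + bx^2 + cx + d = \left(y^3 - by^2 + \dfrac{b^2}{3}y - \dfrac{b^3}{27}\right) + b\left(y^2 - \dfrac{2b}{3}y + \dfrac{b^2}{9}\right) + c\left(y - \dfrac{b}{3}\right) + d.$

If we collect terms, then we can write the resulting equation in $y$ as

$y^3 + py + q = 0,$

where

(1.2)    $p = -\dfrac{b^2}{3} + c, \qquad q = \dfrac{2b^3}{27} - \dfrac{bc}{3} + d.$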


You will verify the details of this calculation in Exercise 1. 

We call a cubic of the form $y^3 + py + q = 0$ a reduced cubic. If we can find
the roots $y_1, y_2, y_3$ of the reduced cubic, then we get the roots of the original cubic
$x^3 + bx^2 + cx + d = 0$ by adding $-b/3$ to each $y_i$.
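
The reduction to a reduced cubic can be spot-checked numerically. The following is a minimal sketch, not part of the original text: the helper names `reduce_cubic` and `cubic` and the sample cubic are chosen here for illustration only. It computes $p$ and $q$ from (1.2) and compares both sides of the substitution $x = y - b/3$ at one sample value of $y$.

```python
def reduce_cubic(b, c, d):
    """Return (p, q) from (1.2) for the monic cubic x^3 + b x^2 + c x + d."""
    p = -b**2 / 3 + c
    q = 2 * b**3 / 27 - b * c / 3 + d
    return p, q


def cubic(x, b, c, d):
    """Evaluate x^3 + b x^2 + c x + d."""
    return x**3 + b * x**2 + c * x + d


# Sample cubic (x + 1)(x + 2)(x + 3) = x^3 + 6x^2 + 11x + 6 (an illustrative choice).
b, c, d = 6.0, 11.0, 6.0
p, q = reduce_cubic(b, c, d)

# Substituting x = y - b/3 should turn the cubic into y^3 + p y + q for every y.
y = 0.7
original_side = cubic(y - b / 3, b, c, d)
reduced_side = y**3 + p * y + q
print(p, q, abs(original_side - reduced_side) < 1e-12)
```

For this sample the reduced cubic is $y^3 - y$, whose roots $0, \pm 1$ shift by $-b/3 = -2$ to the roots $-2, -1, -3$ of the original cubic.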

To solve $y^3 + py + q = 0$, we use the substitution

(1.3)    $y = z - \dfrac{p}{3z}.$
This change of variable has a dramatic effect on the equation. Using the binomial
theorem again, we obtain

$y^3 = z^3 - 3z^2\left(\dfrac{p}{3z}\right) + 3z\left(\dfrac{p}{3z}\right)^2 - \left(\dfrac{p}{3z}\right)^3 = z^3 - pz + \dfrac{p^2}{3z} - \dfrac{p^3}{27z^3}.$

Combining this with (1.3) gives

$y^3 + py + q = \left(z^3 - pz + \dfrac{p^2}{3z} - \dfrac{p^3}{27z^3}\right) + p\left(z - \dfrac{p}{3z}\right) + q = z^3 - \dfrac{p^3}{27z^3} + q.$

Multiplying by $z^3$, we conclude that $y^3 + py + q = 0$ is equivalent to the equation

(1.4)    $z^6 + qz^3 - \dfrac{p^3}{27} = 0.$

This equation is the cubic resolvent of the reduced cubic $y^3 + py + q = 0$.
At first glance, (1.4) might not seem useful, since we have replaced a cubic
equation with one of degree 6. However, upon closer inspection, we see that the
cubic resolvent can be written as

$(z^3)^2 + qz^3 - \dfrac{p^3}{27} = 0.$


By the quadratic formula (1.1), we obtain

$z^3 = \dfrac{1}{2}\left(-q \pm \sqrt{q^2 + \dfrac{4p^3}{27}}\right),$

so that

(1.5)    $z = \sqrt[3]{\dfrac{1}{2}\left(-q \pm \sqrt{q^2 + \dfrac{4p^3}{27}}\right)}.$

Substituting this into (1.3) gives a root of the reduced cubic $y^3 + py + q$, and then
$x = y - b/3$ is a root of the cubic $x^3 + bx^2 + cx + d$.

However, before we can claim to have solved the cubic, there are several questions
that need to be answered:

• By setting $y^3 + py + q = 0$, we essentially assumed that a solution exists. What
justifies this assumption?

• A cubic equation has three roots, yet the cubic resolvent has degree 6. Why?

• The substitution (1.3) assumes that $z \neq 0$. What happens when $z = 0$?

• $y^3 + py + q$ has coefficients in $\mathbb{C}$, since $b, c, d \in \mathbb{C}$. Thus (1.5) involves square
roots and cube roots of complex numbers. How are these described?

The first bullet will be answered in Chapter 3 when we discuss the existence of roots.
The second bullet will be considered in Section 1.2, though the ultimate answer will
involve Galois theory. For the rest of this section, we will concentrate on the last two
bullets. Our strategy will be to study the formula (1.5) in more detail.

First assume that $p \neq 0$ in the reduced cubic $y^3 + py + q$. By Section A.2, every
nonzero complex number has $n$ distinct $n$th roots when $n \in \mathbb{Z}$ is positive. In (1.5),
the $\pm$ in the formula indicates that a nonzero complex number has two square roots.
Similarly, the cube root symbol denotes any of the three cube roots of the complex
number under the radical. To understand these cube roots, we use the cube roots of
unity $1, \zeta_3, \zeta_3^2$ from Section A.2. We will write $\zeta_3$ as $\omega$. Recall that

$\omega = \zeta_3 = e^{2\pi i/3} = \dfrac{-1 + i\sqrt{3}}{2}$

and that given one cube root of a nonzero complex number, we get the other two cube
roots by multiplying by $\omega$ and $\omega^2$.
We can now make sense of (1.5). Let

$\sqrt{q^2 + \dfrac{4p^3}{27}}$

denote a fixed square root of $q^2 + 4p^3/27 \in \mathbb{C}$. With this choice of square root, let

$z_1 = \sqrt[3]{\dfrac{1}{2}\left(-q + \sqrt{q^2 + \dfrac{4p^3}{27}}\right)}$

denote a fixed cube root of $\frac{1}{2}\left(-q + \sqrt{q^2 + 4p^3/27}\right)$. Then we get the other two cube
roots by multiplying by $\omega$ and $\omega^2$. Note also that $p \neq 0$ implies that $z_1 \neq 0$ and that
$z_1$ is a root of the cubic resolvent (1.4). It follows easily that if we set

$z_2 = -\dfrac{p}{3z_1},$

then

(1.6)    $y_1 = z_1 + z_2 = z_1 - \dfrac{p}{3z_1}$

is a root of the reduced cubic $y^3 + py + q$.
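To understand $z_2$, observe that

$z_1^3 z_2^3 = z_1^3\left(-\dfrac{p}{3z_1}\right)^3 = -\dfrac{p^3}{27}.$

An easy calculation shows that

$\dfrac{1}{2}\left(-q + \sqrt{q^2 + \dfrac{4p^3}{27}}\right) \cdot \dfrac{1}{2}\left(-q - \sqrt{q^2 + \dfrac{4p^3}{27}}\right) = \dfrac{1}{4}\left(q^2 - \left(q^2 + \dfrac{4p^3}{27}\right)\right) = -\dfrac{p^3}{27}.$

Since $z_1 \neq 0$, these formulas imply that

$z_2^3 = \dfrac{1}{2}\left(-q - \sqrt{q^2 + \dfrac{4p^3}{27}}\right).$

Hence $z_2 = -p/3z_1$ is a cube root of $\frac{1}{2}\left(-q - \sqrt{q^2 + 4p^3/27}\right)$, so that

(1.7)    $z_1 = \sqrt[3]{\dfrac{1}{2}\left(-q + \sqrt{q^2 + \dfrac{4p^3}{27}}\right)}$ and $z_2 = \sqrt[3]{\dfrac{1}{2}\left(-q - \sqrt{q^2 + \dfrac{4p^3}{27}}\right)}$

are cube roots with the property that their product is $-p/3$.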

From (1.6), we see that $y_1 = z_1 + z_2$ is a root of $y^3 + py + q$ when $z_1$ and $z_2$ are
the above cube roots. To get the other roots, note that (1.6) gives a root of the
cubic whenever the cube roots are chosen so that their product is $-p/3$ (be sure you
understand this). For example, if we use the cube root $\omega z_1$, then

$\omega z_1 \cdot \omega^2 z_2 = z_1 z_2 = -\dfrac{p}{3}$

shows that $y_2 = \omega z_1 + \omega^2 z_2$ is also a root. Similarly, using the cube root $\omega^2 z_1$ shows
that $y_3 = \omega^2 z_1 + \omega z_2$ is a third root of the reduced cubic.
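By (1.7), it follows that the three roots of $y^3 + py + q = 0$ are given by

$y_1 = \sqrt[3]{\dfrac{1}{2}\left(-q + \sqrt{q^2 + \dfrac{4p^3}{27}}\right)} + \sqrt[3]{\dfrac{1}{2}\left(-q - \sqrt{q^2 + \dfrac{4p^3}{27}}\right)},$

$y_2 = \omega\sqrt[3]{\dfrac{1}{2}\left(-q + \sqrt{q^2 + \dfrac{4p^3}{27}}\right)} + \omega^2\sqrt[3]{\dfrac{1}{2}\left(-q - \sqrt{q^2 + \dfrac{4p^3}{27}}\right)},$

$y_3 = \omega^2\sqrt[3]{\dfrac{1}{2}\left(-q + \sqrt{q^2 + \dfrac{4p^3}{27}}\right)} + \omega\sqrt[3]{\dfrac{1}{2}\left(-q - \sqrt{q^2 + \dfrac{4p^3}{27}}\right)},$

provided the cube roots in (1.7) are chosen so that their product is $-p/3$. These are
Cardan’s formulas for the roots of the reduced cubic $y^3 + py + q$.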
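
Cardan's formulas translate directly into a short numerical routine. The sketch below is an illustration only, under stated assumptions: the names `cardan_reduced` and `OMEGA` are invented here, the principal branches supplied by Python's `cmath` play the role of the "fixed" square root and cube root, and $p \neq 0$ is assumed as in the derivation. Setting $z_2 = -p/3z_1$ enforces the product condition automatically. No attempt is made at numerical robustness.

```python
import cmath

OMEGA = cmath.exp(2j * cmath.pi / 3)  # the cube root of unity written ω in the text


def cardan_reduced(p, q):
    """Roots y1, y2, y3 of y^3 + p y + q, assuming p != 0."""
    if p == 0:
        raise ValueError("this sketch assumes p != 0, as in the derivation")
    sqrt_term = cmath.sqrt(q * q + 4 * p**3 / 27)  # one fixed square root
    z1 = (0.5 * (-q + sqrt_term)) ** (1 / 3)       # one fixed cube root (principal)
    z2 = -p / (3 * z1)                             # guarantees z1 * z2 = -p/3
    return [
        z1 + z2,
        OMEGA * z1 + OMEGA**2 * z2,
        OMEGA**2 * z1 + OMEGA * z2,
    ]


# Example: the reduced cubic y^3 + 3y + 1; each residual should be near zero.
for y in cardan_reduced(3, 1):
    print(y, abs(y**3 + 3 * y + 1))
```

Computing $z_2$ from $z_1$, rather than taking a second independent cube root, avoids having to search the nine possible pairs of cube roots for one whose product is $-p/3$.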


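Example 1.1.1 For the reduced cubic $y^3 + 3y + 1$, consider the real cube roots

$z_1 = \sqrt[3]{\frac{1}{2}(-1 + \sqrt{5})}$ and $z_2 = \sqrt[3]{\frac{1}{2}(-1 - \sqrt{5})}.$

Their product is $-1 = -p/3$, so by Cardan’s formulas, the roots of $y^3 + 3y + 1$ are

$y_1 = \sqrt[3]{\frac{1}{2}(-1 + \sqrt{5})} + \sqrt[3]{\frac{1}{2}(-1 - \sqrt{5})},$
$y_2 = \omega\sqrt[3]{\frac{1}{2}(-1 + \sqrt{5})} + \omega^2\sqrt[3]{\frac{1}{2}(-1 - \sqrt{5})},$
$y_3 = \omega^2\sqrt[3]{\frac{1}{2}(-1 + \sqrt{5})} + \omega\sqrt[3]{\frac{1}{2}(-1 - \sqrt{5})}.$

Note that $y_1$ is real. In Exercise 2 you will show that $y_2$ and $y_3$ are complex conjugates
of each other. <>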


Although Cardan’s formulas only apply to a reduced cubic, we get formulas for
the roots of an arbitrary monic cubic polynomial $x^3 + bx^2 + cx + d \in \mathbb{C}[x]$ as follows.
The substitution $x = y - b/3$ gives the reduced cubic $y^3 + py + q = 0$, where $p$ and $q$
are as in (1.2). If $z_1$ and $z_2$ are the cube roots in Cardan’s formulas for $y^3 + py + q = 0$,
then the roots of $x^3 + bx^2 + cx + d = 0$ are given by

$x_1 = -\dfrac{b}{3} + z_1 + z_2,$
$x_2 = -\dfrac{b}{3} + \omega z_1 + \omega^2 z_2,$
$x_3 = -\dfrac{b}{3} + \omega^2 z_1 + \omega z_2,$

where $z_1$ and $z_2$ from (1.7) satisfy $z_1 z_2 = -p/3$. Our derivation assumed $p \neq 0$, but
these formulas give the correct roots even when $p = 0$ (see Exercise 3).
We will eventually see that Cardan’s formulas make perfect sense from the point
of view of Galois theory. For example, the quantity under the square root in (1.5) is

$q^2 + \dfrac{4p^3}{27}.$

Up to a constant factor, this is the discriminant of the polynomial $y^3 + py + q$. We
will give a careful definition of discriminant in Section 1.2, and Section 1.3 will show
that the discriminant gives useful information about the roots of a real cubic.

Here is an example of a puzzle that arises when using Cardan’s formula. 


Example 1.1.2 The cubic equation $y^3 - 3y = 0$ has roots $y = 0, \pm\sqrt{3}$, all of which
are real. When we apply Cardan’s formulas, we begin with

$z_1 = \sqrt[3]{\dfrac{1}{2}\left(-0 + \sqrt{0^2 + \dfrac{4(-3)^3}{27}}\right)} = \sqrt[3]{i}.$

To pick a specific value for $z_1$, notice that $(-i)^3 = i$, so that we can take $z_1 = -i$.
Thus $z_2 = -p/3z_1 = i$, since $p = -3$. Then Cardan’s formulas give the roots

$y_1 = -i + i = 0,$
$y_2 = \omega(-i) + \omega^2(i) = \sqrt{3},$
$y_3 = \omega^2(-i) + \omega(i) = -\sqrt{3}.$

(You will verify the last two formulas in Exercise 4.) <>


The surprise is that Cardan’s formulas express the real roots of $y^3 - 3y$ in terms of
complex numbers. In Section 1.3, we will prove that for any cubic with distinct real
roots, Cardan’s formulas always involve complex numbers.
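
The arithmetic of Example 1.1.2 can be replayed with complex floating-point numbers. This is a small illustrative check rather than anything from the original text; the variable names are chosen here, and the cube root $z_1 = -i$ is the one selected in the example.

```python
import cmath
import math

OMEGA = cmath.exp(2j * cmath.pi / 3)  # ω = (-1 + i√3)/2

p, q = -3, 0          # the cubic y^3 - 3y of Example 1.1.2
z1 = -1j              # the example's choice of a cube root of i
z2 = -p / (3 * z1)    # equals i, so that z1 * z2 = -p/3 = 1

roots = [z1 + z2, OMEGA * z1 + OMEGA**2 * z2, OMEGA**2 * z1 + OMEGA * z2]
for y in roots:
    # The imaginary parts cancel (up to rounding), leaving 0, √3, and -√3.
    print(round(y.real, 12), round(y.imag, 12))
print(math.sqrt(3))
```

Both intermediate quantities $z_1$ and $z_2$ are purely imaginary, yet every sum is real up to rounding error.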


Historical Notes 


The quadratic formula is very old, dating back to the Babylonians, circa 1700 B.C.
Cubic equations were first studied systematically by Islamic mathematicians such as
Omar Khayyam, and by the Middle Ages cubic equations had become a popular topic.
For example, when Leonardo of Pisa (also known as Fibonacci) was introduced to
Emperor Frederick II in 1225, Fibonacci was asked to solve two problems, the second
of which was the cubic equation

$x^3 + 2x^2 + 10x = 20.$


Fibonacci’s solution was

$x = 1 + \dfrac{22}{60} + \dfrac{7}{60^2} + \dfrac{42}{60^3} + \dfrac{33}{60^4} + \dfrac{4}{60^5} + \dfrac{40}{60^6}.$

In decimal notation, this gives $x = 1.368808107853\ldots$, which is correct to 10 decimal
places. Not bad for 787 years ago!
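
Fibonacci's sexagesimal value is easy to convert and test. The sketch below is only an illustration with variable names chosen here: it sums the base-60 digits exactly using Python's `fractions` module and then substitutes the decimal value into $x^3 + 2x^2 + 10x - 20$.

```python
from fractions import Fraction

# Sexagesimal digits after the leading 1, as in Fibonacci's solution.
digits = [22, 7, 42, 33, 4, 40]
x = Fraction(1) + sum(Fraction(dk, 60**k) for k, dk in enumerate(digits, start=1))

approx = float(x)
residual = approx**3 + 2 * approx**2 + 10 * approx - 20
print(approx, residual)
```

The residual is tiny but not zero, which reflects the fact that the value is an approximation to the real root rather than an exact solution.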

Challenges and contests involving cubic equations were not uncommon during 
the Middle Ages, and one such contest played a crucial role in the development of 
Cardan’s formula. Early in the sixteenth century, Scipio del Ferro found a solution 
for cubics of the form $x^3 + bx = c$, where $b$ and $c$ are positive. His student Florido
knew this solution, and in 1535, Florido challenged Niccolò Fontana (also known as
Tartaglia) to a contest involving 30 cubic equations. Working feverishly in preparation 
for the contest, Tartaglia worked out the solution of this and other cases, and went 
on to defeat Florido. In 1539, Tartaglia told his solution to Girolamo Cardan (or 
Cardano), who published it in 1545 in his book Ars Magna (see [2]). 

Rather than present one solution to the cubic, as we have done here, Cardan’s 
treatment in Ars Magna requires 13 cases. For example, Chapter XIV considers 
$x^3 + 64 = 18x^2$, and Chapter XV does $x^3 + 6x^2 = 40$. The reason is that Cardan
prefers positive coefficients. However, he makes systematic use of the substitution
$x = y - b/3$ to get rid of the coefficient of $x^2$, and Cardan was also aware that complex
numbers can arise in solutions of quadratic equations. 

Numerous other people worked to simplify and understand Cardan’s solution. In 
1550, Rafael Bombelli considered more carefully the role of complex solutions (see 
Section 1.3), and in two papers published posthumously in 1615, François Viète (or
Vieta, in Latin) introduced the substitution (1.3) used in our derivation of Cardan’s
formulas and gave the trigonometric solution to be discussed in Section 1.3.

In addition to the cubic, Ars Magna also contained a solution for the quartic 
equation due to Lodovico (or Luigi) Ferrari, a student of Cardan’s. We will discuss 
the solution of the quartic in Chapter 12. 


Exercises for Section 1.1 

Exercise 1. Complete the demonstration (begun in the text) that the substitution $x = y - b/3$
transforms $x^3 + bx^2 + cx + d$ into $y^3 + py + q$, where $p$ and $q$ are given by (1.2).

Exercise 2. In Example 1.1.1, show that $y_2$ and $y_3$ are complex conjugates of each other.

Exercise 3. Show that Cardan’s formulas give the roots of $y^3 + py + q$ when $p = 0$.

Exercise 4. Verify the formulas for $y_2$ and $y_3$ in Example 1.1.2.

Exercise 5. The substitution $x = y - b/3$ can be adapted to other equations as follows.
(a) Show that $x = y - b/2$ gets rid of the coefficient of $x$ in the quadratic equation
$x^2 + bx + c = 0$. Then use this to derive the quadratic formula.
(b) For the quartic equation $x^4 + bx^3 + cx^2 + dx + e = 0$, what substitution should you use to
get rid of the coefficient of $x^3$?
(c) Explain how part (b) generalizes to a monic equation of degree $n$.


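Exercise 6. Consider the equation $x^3 + x - 2 = 0$. Note that $x = 1$ is a root.
(a) Use Cardan’s formulas (carefully) to derive the surprising formula

$1 = \sqrt[3]{1 + \dfrac{2}{3}\sqrt{\dfrac{7}{3}}} + \sqrt[3]{1 - \dfrac{2}{3}\sqrt{\dfrac{7}{3}}}.$

(b) Show that $1 + \frac{2}{3}\sqrt{\frac{7}{3}} = \left(\frac{1}{2} + \frac{1}{2}\sqrt{\frac{7}{3}}\right)^3$, and use this to explain the result of part (a).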


Exercise 7. Cardan’s formulas, as stated in the text, express the roots as sums of two cube
roots. Each cube root has three values, so there are nine different possible values for the sum
of the cube roots. Show that these nine values are the roots of the equations $y^3 + py + q = 0$,
$y^3 + \omega py + q = 0$, and $y^3 + \omega^2 py + q = 0$, where as usual $\omega = \frac{1}{2}(-1 + i\sqrt{3})$.


Exercise 8. Use Cardan’s formulas to solve $y^3 + 3\omega y + 1 = 0$.


1.2 PERMUTATIONS OF THE ROOTS 
In Section 1.1 we learned that the roots of $x^3 + bx^2 + cx + d = 0$ are given by

         $x_1 = -\dfrac{b}{3} + z_1 + z_2,$
(1.8)    $x_2 = -\dfrac{b}{3} + \omega z_1 + \omega^2 z_2,$
         $x_3 = -\dfrac{b}{3} + \omega^2 z_1 + \omega z_2,$

where $z_1$ and $z_2$ are the cube roots (1.7) chosen so that $z_1 z_2 = -p/3$. We also know
that $z_1$ is a root of the cubic resolvent

(1.9)    $z^6 + qz^3 - \dfrac{p^3}{27} = 0,$

and in Exercise 1 you will show that $z_2$ is also a root of (1.9). The goal of this section
is to understand more clearly the relation between $x_1, x_2, x_3$ and $z_1, z_2$. We will learn
that permutations, the discriminant, and symmetric polynomials play an important
role in these formulas.


A. Permutations. We begin by observing that we can use (1.8) to express $z_1, z_2$ in
terms of $x_1, x_2, x_3$. We do this by multiplying the second equation by $\omega^2$ and the third
by $\omega$. When we add the three resulting equations, we obtain

$x_1 + \omega^2 x_2 + \omega x_3 = -(1 + \omega^2 + \omega)\dfrac{b}{3} + 3z_1 + (1 + \omega + \omega^2) z_2.$

However, $\omega$ is a root of $x^3 - 1 = (x - 1)(x^2 + x + 1)$, which implies $1 + \omega + \omega^2 = 0$.
Thus the above equation simplifies to

$x_1 + \omega^2 x_2 + \omega x_3 = 3z_1,$

so that

$z_1 = \dfrac{1}{3}(x_1 + \omega^2 x_2 + \omega x_3).$

Similarly, multiplying the second equation of (1.8) by $\omega$ and the third by $\omega^2$ leads to
the formula

$z_2 = \dfrac{1}{3}(x_1 + \omega x_2 + \omega^2 x_3).$


This shows that the roots $z_1$ and $z_2$ of the cubic resolvent can be expressed in terms
of the roots of the original cubic. However, $z_1$ and $z_2$ are only two of the six roots of
(1.9). What about the other four? In Exercise 1 you will show that the roots of the
cubic resolvent (1.9) are

$z_1,\ z_2,\ \omega z_1,\ \omega z_2,\ \omega^2 z_1,\ \omega^2 z_2,$

and that these roots are given in terms of $x_1, x_2, x_3$ by

          $z_1 = \frac{1}{3}(x_1 + \omega^2 x_2 + \omega x_3),$
          $z_2 = \frac{1}{3}(x_1 + \omega^2 x_3 + \omega x_2),$
(1.10)    $\omega z_1 = \frac{1}{3}(x_2 + \omega^2 x_3 + \omega x_1),$
          $\omega z_2 = \frac{1}{3}(x_3 + \omega^2 x_2 + \omega x_1),$
          $\omega^2 z_1 = \frac{1}{3}(x_3 + \omega^2 x_1 + \omega x_2),$
          $\omega^2 z_2 = \frac{1}{3}(x_2 + \omega^2 x_1 + \omega x_3).$

These expressions for the roots of the resolvent all look similar. What lies behind
this similarity is the following crucial fact: The six roots of the cubic resolvent are
obtained from $z_1$ by permuting $x_1, x_2, x_3$. Hence the symmetric group $S_3$ now enters
the picture.

From an intuitive point of view, this is reasonable, since labeling the roots $x_1, x_2, x_3$
simply lists them in one particular order. If we list the roots in a different order, then
we should still get a root of the resolvent. This also explains why the cubic resolvent
has degree 6, since $|S_3| = 6$.
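
The claim that permuting the roots produces all six roots of the resolvent can be tested numerically. The sketch below is illustrative only and makes several assumptions not found in the text: the sample roots are chosen arbitrarily, the coefficients are obtained by expanding $(x - x_1)(x - x_2)(x - x_3)$, and the function name `resolvent` is invented here. It applies every permutation in $S_3$ to the expression for $z_1$ and evaluates the left-hand side of (1.9).

```python
import cmath
from itertools import permutations

OMEGA = cmath.exp(2j * cmath.pi / 3)  # ω

# Sample cubic with known roots (an illustrative choice): (x - 1)(x - 2)(x + 4).
x1, x2, x3 = 1.0, 2.0, -4.0
b = -(x1 + x2 + x3)
c = x1 * x2 + x1 * x3 + x2 * x3
d = -x1 * x2 * x3

# Coefficients of the reduced cubic, from (1.2).
p = -b**2 / 3 + c
q = 2 * b**3 / 27 - b * c / 3 + d


def resolvent(z):
    """Left-hand side of (1.9): z^6 + q z^3 - p^3/27."""
    return z**6 + q * z**3 - p**3 / 27


# Apply each of the six permutations of the roots to (x1 + ω^2 x2 + ω x3)/3.
values = [(u + OMEGA**2 * v + OMEGA * w) / 3 for u, v, w in permutations((x1, x2, x3))]
for z in values:
    print(z, abs(resolvent(z)))
```

For this sample the six values are distinct and each residual is at the level of rounding error, in line with the degree of the resolvent being $|S_3| = 6$.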


B. The Discriminant. We can also use (1.10) to get a better understanding of the
square root that appears in Cardan’s formulas. If we set

(1.11)    $D = q^2 + \dfrac{4p^3}{27},$

then we can write $z_1$ and $z_2$ as

(1.12)    $z_1 = \sqrt[3]{\frac{1}{2}\left(-q + \sqrt{D}\right)}$ and $z_2 = \sqrt[3]{\frac{1}{2}\left(-q - \sqrt{D}\right)}.$

We claim that $D$ can be expressed in terms of the roots $x_1, x_2, x_3$. To see why, note
that the above formulas imply that

$z_1^3 - z_2^3 = \frac{1}{2}\left(-q + \sqrt{D}\right) - \frac{1}{2}\left(-q - \sqrt{D}\right) = \sqrt{D}.$

However, (A.15) gives the factorization

(1.13)    $z_1^3 - z_2^3 = (z_1 - z_2)(z_1 - \omega z_2)(z_1 - \omega^2 z_2).$
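Using (1.10), we obtain

$z_1 - z_2 = \frac{1}{3}(x_1 + \omega^2 x_2 + \omega x_3) - \frac{1}{3}(x_1 + \omega x_2 + \omega^2 x_3)$
$\qquad\quad = \frac{1}{3}(\omega^2 - \omega)(x_2 - x_3)$
$\qquad\quad = -\dfrac{i}{\sqrt{3}}(x_2 - x_3),$

where the last line uses $\omega^2 - \omega = -i\sqrt{3}$. Similarly, one can show that

(1.14)    $z_1 - \omega z_2 = \dfrac{i\omega^2}{\sqrt{3}}(x_1 - x_3), \qquad z_1 - \omega^2 z_2 = -\dfrac{i\omega}{\sqrt{3}}(x_1 - x_2)$

(see Exercise 2). Combining these formulas with $z_1^3 - z_2^3 = \sqrt{D}$ and (1.13) easily
implies that

(1.15)    $\sqrt{D} = -\dfrac{i}{3\sqrt{3}}(x_1 - x_2)(x_1 - x_3)(x_2 - x_3).$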

If we square this formula for $\sqrt{D}$ and combine it with (1.11), we obtain

(1.16)    $q^2 + \dfrac{4p^3}{27} = -\dfrac{1}{27}(x_1 - x_2)^2 (x_1 - x_3)^2 (x_2 - x_3)^2.$

It is customary to define the discriminant of $x^3 + bx^2 + cx + d$ to be

$\Delta = (x_1 - x_2)^2 (x_1 - x_3)^2 (x_2 - x_3)^2.$

Thus $\Delta$ is the product of the squares of the differences of the roots. In this notation
we can write (1.16) as

(1.17)    $q^2 + \dfrac{4p^3}{27} = -\dfrac{\Delta}{27}.$

Then (1.12) becomes

(1.18)    $z_1 = \sqrt[3]{\dfrac{1}{2}\left(-q + \sqrt{-\dfrac{\Delta}{27}}\right)}$ and $z_2 = \sqrt[3]{\dfrac{1}{2}\left(-q - \sqrt{-\dfrac{\Delta}{27}}\right)}.$

Substituting this into (1.8), we get a version of Cardan’s formulas which uses the
square root of the discriminant.
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The discriminant is also important in the quadratic case. By the quadratic formula, 
the roots of x? + bx +c are 


—b+VJVA -b-VJVA 
1 =— and = — J 


where A = b? — 4c is the discriminant. This makes it easy to see that 
VA=x,—-x%. and A=(x,—x)?. 


Thus the discriminant is the square of the difference of the roots. In Chapter 2 we 
will study the discriminant of a polynomial of degree n. 
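
As a quick sanity check of (1.17), the following Python sketch takes a cubic $y^3 + py + q$ with known integer roots and compares $q^2 + 4p^3/27$ with $-\Delta/27$ in exact rational arithmetic. The function name `check_1_17` and the sample roots are illustrative assumptions, not part of the text.

```python
from fractions import Fraction

def check_1_17(y1, y2, y3):
    """Return (D, delta) for the cubic y^3 + p*y + q with the given roots.

    Assumes y1 + y2 + y3 == 0, so that the cubic has no y^2 term.
    """
    if y1 + y2 + y3 != 0:
        raise ValueError("roots must sum to zero")
    # Coefficients of the cubic in terms of its roots (no y^2 term).
    p = y1 * y2 + y1 * y3 + y2 * y3
    q = -y1 * y2 * y3
    D = Fraction(q) ** 2 + Fraction(4 * p ** 3, 27)    # equation (1.11)
    delta = ((y1 - y2) * (y1 - y3) * (y2 - y3)) ** 2   # the discriminant
    return D, delta

# Hypothetical example: roots 1, 2, -3, i.e. the cubic y^3 - 7y + 6.
D, delta = check_1_17(1, 2, -3)
print(D, delta)   # -400/27 400
```

Using `Fraction` keeps the comparison exact, so no floating-point tolerance is needed.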


C. Symmetric Polynomials. We begin with two interesting properties of

$$\Delta = (x_1 - x_2)^2(x_1 - x_3)^2(x_2 - x_3)^2.$$

First suppose that we permute $x_1, x_2, x_3$ in this formula. The observation is that no
matter how we do this, we will still have the product of the squares of the differences
of the roots. This shows that $\Delta$ is unchanged by permutations of the roots. In the
language of Chapter 2 we say that $\Delta$ is symmetric in the roots $x_1, x_2, x_3$.

Second, we can also express $\Delta$ in terms of the coefficients of $x^3 + bx^2 + cx + d$.
By (1.17), we know that $\Delta = -4p^3 - 27q^2$. However, we also have

$$p = -\frac{b^2}{3} + c, \qquad q = \frac{2b^3}{27} - \frac{bc}{3} + d \tag{1.19}$$

by Exercise 1 of Section 1.1. If we substitute these into (1.17), then a straightforward
calculation shows that

$$\Delta = b^2c^2 + 18bcd - 4c^3 - 4b^3d - 27d^2 \tag{1.20}$$

(see Exercise 3). When $b = 0$, it follows that $x^3 + cx + d$ has discriminant

$$\Delta = -4c^3 - 27d^2.$$


This will be useful in Section 1.3. 
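
Formula (1.20) is easy to test on a cubic whose roots are known. The sketch below (the function names are our own, chosen for illustration) evaluates (1.20) for $x^3 - 7x^2 + 14x - 8 = (x-1)(x-2)(x-4)$ and compares it with the product of the squared differences of the roots.

```python
def disc_from_coeffs(b, c, d):
    """Discriminant of x^3 + b*x^2 + c*x + d, computed from formula (1.20)."""
    return b * b * c * c + 18 * b * c * d - 4 * c ** 3 - 4 * b ** 3 * d - 27 * d * d

def disc_from_roots(x1, x2, x3):
    """Discriminant as the product of the squared differences of the roots."""
    return ((x1 - x2) * (x1 - x3) * (x2 - x3)) ** 2

# x^3 - 7x^2 + 14x - 8 = (x - 1)(x - 2)(x - 4)
print(disc_from_coeffs(-7, 14, -8), disc_from_roots(1, 2, 4))   # 36 36

# With b = 0 the formula reduces to -4c^3 - 27d^2.
print(disc_from_coeffs(0, -15, -4))   # 13068
```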

The above formula expresses the discriminant in terms of the coefficients of the
original equation, just as the discriminant of $x^2 + bx + c = 0$ is $\Delta = b^2 - 4c$. The
Fundamental Theorem of Symmetric Polynomials, to be proved in Chapter 2, will
imply that any symmetric polynomial in $x_1, x_2, x_3$ can be expressed in terms of the
coefficients $b, c, d$. In order to see why $b, c, d$ are so important, note that if $x_1, x_2, x_3$
are the roots of $x^3 + bx^2 + cx + d$, then

$$x^3 + bx^2 + cx + d = (x - x_1)(x - x_2)(x - x_3).$$

Multiplying out the right-hand side and comparing coefficients leads to the following
formulas for $b, c, d$:

$$\begin{aligned} b &= -(x_1 + x_2 + x_3), \\ c &= x_1x_2 + x_1x_3 + x_2x_3, \\ d &= -x_1x_2x_3. \end{aligned} \tag{1.21}$$


These formulas show that the coefficients of a cubic can be expressed as symmetric 
functions of its roots. The polynomials b, c, d are (up to sign) the elementary 
symmetric polynomials of $x_1, x_2, x_3$. These polynomials (and their generalization to
an arbitrary number of variables) will play a crucial role in Chapter 2. 
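
The formulas (1.21) can be checked mechanically by multiplying out the three linear factors. The following sketch uses plain coefficient lists (highest degree first); the helper names `poly_mul` and `cubic_from_roots` are hypothetical and only serve this illustration.

```python
def poly_mul(f, g):
    """Multiply two polynomials given as coefficient lists, highest degree first."""
    out = [0] * (len(f) + len(g) - 1)
    for i, a in enumerate(f):
        for j, b in enumerate(g):
            out[i + j] += a * b
    return out

def cubic_from_roots(x1, x2, x3):
    """Expand (x - x1)(x - x2)(x - x3) into the list [1, b, c, d]."""
    f = [1]
    for root in (x1, x2, x3):
        f = poly_mul(f, [1, -root])
    return f

x1, x2, x3 = 2, -3, 5
expanded = cubic_from_roots(x1, x2, x3)
# Right-hand sides of (1.21):
vieta = [1, -(x1 + x2 + x3), x1 * x2 + x1 * x3 + x2 * x3, -x1 * x2 * x3]
print(expanded, vieta)   # [1, -4, -11, 30] [1, -4, -11, 30]
```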


Mathematical Notes 
One aspect of the text needs further discussion. 


• Algebra versus Abstract Algebra. High school algebra is very different from a
course on groups, rings, and fields, yet both are called “algebra.” The evolution of 
algebra can be seen in the difference between Section 1.1, where we used high school 
algebra, and this section, where questions about the underlying structure (why does 
the cubic resolvent have degree 6?) led us to realize the importance of permutations. 
Many concepts in abstract algebra came from high school algebra in this way. 


Historical Notes 


In 1770 and 1771, Lagrange’s magnificent treatise Réflexions sur la résolution 
algébrique des équations appeared in the Nouvelles Mémoires de l’Académie royale
des Sciences et Belles-Lettres de Berlin. This long paper covers pages 205-421 in
Volume 3 of Lagrange’s collected works [Lagrange]. It is a leisurely account of the
known methods for solving equations of degree 3 and 4, together with an analysis of
these methods from the point of view of permutations. Lagrange wanted to determine
whether these methods could be adapted to equations of degree $\ge 5$.

One of Lagrange’s powerful ideas is that one should study the roots of a polynomial 
without regard to their possible numerical value. When dealing with functions of the 
roots, such as

$$z_1 = \tfrac{1}{3}(x_1 + \omega^2 x_2 + \omega x_3)$$

from (1.10), Lagrange says that he is concerned “only with the form” of such expressions
and not “with their numerical quantity” [Lagrange, Vol. 3, p. 385]. In modern
terms, Lagrange is saying that we should regard the roots as variables. We will learn 
more about this idea when we discuss the universal polynomial in Chapter 2. 

We will see in Chapter 12 that many basic ideas from group theory and Galois 
theory are implicit in Lagrange’s work. However, Lagrange’s approach fails when 
the roots take on specific numerical values. This is part of why Galois’s work is so 
important: he was able to treat the case when the roots were arbitrary. The ideas of
Galois, of course, are the foundation of what we now call Galois theory. This will be
the main topic of Chapters 4-7. 


Exercises for Section 1.2 


Exercise 1. Let $z_1, z_2$ be the roots of (1.9) chosen at the beginning of the section.
(a) Show that $z_1, z_2, \omega z_1, \omega z_2, \omega^2 z_1, \omega^2 z_2$ are the six roots of the cubic resolvent.
(b) Prove (1.10). 


Exercise 2. Prove (1.14) and (1.15). 
Exercise 3. Prove (1.20). 


Exercise 4. We say that a cubic $x^3 + bx^2 + cx + d$ has a multiple root if it can be written as
$(x - r_1)^2(x - r_2)$. Prove that $x^3 + bx^2 + cx + d$ has a multiple root if and only if its discriminant
is zero.


Exercise 5. Since $\Delta = (x_1 - x_2)^2(x_1 - x_3)^2(x_2 - x_3)^2$, we can define the square root of $\Delta$ to
be $\sqrt{\Delta} = (x_1 - x_2)(x_1 - x_3)(x_2 - x_3)$. Prove that an even permutation of the roots takes $\sqrt{\Delta}$
to $\sqrt{\Delta}$ while an odd permutation takes $\sqrt{\Delta}$ to $-\sqrt{\Delta}$. In Section 2.4 we will see that this
generalizes nicely to the case of degree n. 


1.3 CUBIC EQUATIONS OVER THE REAL NUMBERS 


The final topic of this chapter concerns cubic equations with coefficients in the 
field R of real numbers. As in Section 1.1, we can reduce to equations of the form 
$y^3 + py + q = 0$, where $p, q \in \mathbb{R}$. Then Cardan’s formulas show that the roots $y_1, y_2, y_3$
lie in the field C of complex numbers. We will show that the sign of the discriminant
of $y^3 + py + q = 0$ tells us how many of the roots are real. We will also give an
unexpected application of trigonometry when the roots are all real. 


A. The Number of Real Roots. The discriminant of $y^3 + py + q$ is

$$\Delta = (y_1 - y_2)^2(y_1 - y_3)^2(y_2 - y_3)^2.$$

As we noted in the discussion following (1.20), $\Delta$ can be expressed as

$$\Delta = -4p^3 - 27q^2. \tag{1.22}$$


You will give a different proof of this in Exercise 1. 

For the rest of the section we will assume that the cubic $y^3 + py + q$ has distinct
roots $y_1, y_2, y_3$. It follows that the discriminant $\Delta$ is a nonzero real number. We next
show that the sign of $\Delta$ gives interesting information about the roots.


Theorem 1.3.1 Suppose that the polynomial $y^3 + py + q \in \mathbb{R}[y]$ has distinct roots
and discriminant $\Delta \ne 0$. Then:

(a) $\Delta > 0$ if and only if the roots of $y^3 + py + q = 0$ are all real.

(b) $\Delta < 0$ if and only if $y^3 + py + q = 0$ has only one real root and the other two
roots are complex conjugates of each other.

Proof: First recall from Section A.2 that complex conjugation $z \mapsto \bar{z}$ satisfies
$\overline{z + w} = \bar{z} + \bar{w}$ and $\overline{zw} = \bar{z}\,\bar{w}$. It follows that if $y_1$ is a root of $y^3 + py + q = 0$, then

$$0 = \bar{0} = \overline{y_1^3 + py_1 + q} = \bar{y}_1^3 + p\bar{y}_1 + q,$$

so that $\bar{y}_1$ is also a root. This proves the standard fact that the roots of a polynomial
with real coefficients either are real (if $\bar{y}_1 = y_1$) or come in complex conjugate pairs
(if $\bar{y}_1 \ne y_1$).

If $y_1, y_2, y_3$ are all real and distinct, then $\Delta = (y_1 - y_2)^2(y_1 - y_3)^2(y_2 - y_3)^2$ shows
that $\Delta > 0$. If the roots are not all real, then the above discussion shows that we
must have one real root, say $y_1$, and a complex conjugate pair, say $y_2$ and $\bar{y}_2$. Write
$y_2 = u + iv$, where $u, v \in \mathbb{R}$ and $v \ne 0$. Then $y_3 = u - iv$ and

$$\begin{aligned} \Delta &= \bigl(y_1 - (u + iv)\bigr)^2\bigl(y_1 - (u - iv)\bigr)^2\bigl((u + iv) - (u - iv)\bigr)^2 \\ &= \bigl((y_1 - u) - iv\bigr)^2\bigl((y_1 - u) + iv\bigr)^2(2iv)^2 \\ &= -4v^2\bigl((y_1 - u)^2 + v^2\bigr)^2. \end{aligned}$$

It follows that $\Delta < 0$ when there is only one real root. This completes the proof.


In Exercises 2-5, we will sketch a different proof of Theorem 1.3.1 which uses 
curve graphing techniques from calculus. 
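
Theorem 1.3.1 turns the question "how many real roots?" into a sign test on $\Delta = -4p^3 - 27q^2$. A minimal sketch, assuming integer or rational coefficients so that the sign is computed exactly (the function name `count_real_roots` is ours):

```python
def count_real_roots(p, q):
    """Number of real roots of y^3 + p*y + q, assuming the roots are distinct.

    Uses Theorem 1.3.1: the sign of the discriminant (1.22) decides the answer.
    """
    delta = -4 * p ** 3 - 27 * q ** 2
    if delta > 0:
        return 3
    if delta < 0:
        return 1
    raise ValueError("discriminant is zero: the cubic has a multiple root")

print(count_real_roots(-7, 6))   # 3  (y^3 - 7y + 6 has roots 1, 2, -3)
print(count_real_roots(1, 1))    # 1  (discriminant is -31)
```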
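We next apply the theory developed so far to Cardan’s formulas

$$\begin{aligned} y_1 &= z_1 + z_2, \\ y_2 &= \omega z_1 + \omega^2 z_2, \\ y_3 &= \omega^2 z_1 + \omega z_2, \end{aligned}$$

where the cube roots

$$z_1 = \sqrt[3]{\tfrac{1}{2}\Bigl(-q + \sqrt{q^2 + \tfrac{4p^3}{27}}\Bigr)} \quad\text{and}\quad z_2 = \sqrt[3]{\tfrac{1}{2}\Bigl(-q - \sqrt{q^2 + \tfrac{4p^3}{27}}\Bigr)} \tag{1.23}$$

are chosen so that $z_1z_2 = -p/3$.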
First, suppose that $\Delta < 0$. Then Theorem 1.3.1 implies that $y^3 + py + q = 0$ has
precisely one real root. Furthermore, by (1.22), we have

$$\Delta = -4p^3 - 27q^2 < 0.$$

Hence the square root $\sqrt{q^2 + 4p^3/27}$ is real (its radicand equals $-\Delta/27 > 0$), which means that we can take $z_1$ to be
the unique real cube root. Then $z_1z_2 = -p/3$ implies that $z_2$ is also the real cube root.
It follows that

$$y_1 = \sqrt[3]{\tfrac{1}{2}\Bigl(-q + \sqrt{q^2 + \tfrac{4p^3}{27}}\Bigr)} + \sqrt[3]{\tfrac{1}{2}\Bigl(-q - \sqrt{q^2 + \tfrac{4p^3}{27}}\Bigr)}$$


expresses the real root of $y^3 + py + q = 0$ in terms of real radicals. Furthermore, in
the above formulas for $y_2$ and $y_3$, we see that $y_3 = \bar{y}_2$, since the cube roots are real and
$\bar{\omega} = \omega^2$. Thus we have a complete understanding of how Cardan’s formulas work
when the discriminant is negative.
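
For $\Delta < 0$ the recipe above uses only real arithmetic, so it can be sketched directly in floating point. The helper `real_cbrt` and the sample cubic $y^3 + 3y - 4$ (which visibly has the root $y = 1$) are assumptions made for illustration.

```python
import math

def real_cbrt(x):
    """Real cube root of a real number (works for negative x as well)."""
    return math.copysign(abs(x) ** (1.0 / 3.0), x)

def cardan_real_root(p, q):
    """The unique real root of y^3 + p*y + q when the discriminant is negative."""
    delta = -4 * p ** 3 - 27 * q ** 2
    if delta >= 0:
        raise ValueError("this sketch only covers the case of negative discriminant")
    s = math.sqrt(q * q + 4 * p ** 3 / 27)   # real, since the radicand is -delta/27 > 0
    z1 = real_cbrt((-q + s) / 2)
    z2 = real_cbrt((-q - s) / 2)
    return z1 + z2

root = cardan_real_root(3, -4)   # y^3 + 3y - 4 = 0
print(root)
```

Up to rounding, the printed value is the root $y = 1$. Because both cube roots are the real ones, the condition $z_1z_2 = -p/3$ holds automatically in this case.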
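However, the case when $\Delta > 0$ is very different. Here, $y^3 + py + q = 0$ has three
real roots by Theorem 1.3.1. Since

$$\Delta = -4p^3 - 27q^2 > 0,$$

one value of the square root $\sqrt{q^2 + 4p^3/27}$ is

$$\sqrt{q^2 + \frac{4p^3}{27}} = \sqrt{\frac{-\Delta}{27}} = i\sqrt{\frac{\Delta}{27}}.$$

Using this and (1.23), we can write $z_1$ and $z_2$ as the cube roots

$$z_1 = \sqrt[3]{\tfrac{1}{2}\Bigl(-q + i\sqrt{\tfrac{\Delta}{27}}\Bigr)} \quad\text{and}\quad z_2 = \sqrt[3]{\tfrac{1}{2}\Bigl(-q - i\sqrt{\tfrac{\Delta}{27}}\Bigr)}.$$

This shows that $z_1$ and $z_2$ are both nonreal complex numbers when $\Delta > 0$. You will
prove in Exercise 6 that

$$z_1z_2 = -\frac{p}{3} \implies z_2 = \bar{z}_1. \tag{1.24}$$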


Combining (1.24) with Cardan’s formulas, we see that when $\Delta > 0$, the roots of
$y^3 + py + q$ can be written

$$\begin{aligned} y_1 &= z_1 + \bar{z}_1, \\ y_2 &= \omega z_1 + \omega^2\bar{z}_1, \\ y_3 &= \omega^2 z_1 + \omega\bar{z}_1. \end{aligned}$$

The root $y_1$ is real, since it is expressed as the sum of a complex number and its
conjugate. Furthermore, using $\bar{\omega} = \omega^2$, one easily sees that

$$\overline{\omega z_1} = \omega^2\bar{z}_1 \quad\text{and}\quad \overline{\omega^2 z_1} = \omega\bar{z}_1,$$

so that $y_2$ and $y_3$ are also real, since they too are the sum of a complex number and
its conjugate.

Notice that, unlike the case when $\Delta < 0$, we no longer have a canonical choice of
$z_1$: it is just one cube root of the complex number $\tfrac{1}{2}\bigl(-q + i\sqrt{\Delta/27}\bigr)$. Furthermore,
we get $y_1, y_2, y_3$ by taking the three cube roots of this number and adding each to its
conjugate. This explains how Cardan’s formulas work when $\Delta > 0$.

The puzzle, of course, is that we are using complex numbers to express the real 
roots of a real polynomial. Historically, this is referred to as the casus irreducibilis. 
We will have more to say about this below. 


Example 1.3.2 In 1550, Rafael Bombelli applied Cardan’s formulas to the cubic
$y^3 - 15y - 4 = 0$. This polynomial has discriminant $\Delta = -4(-15)^3 - 27(-4)^2 = 13068 > 0$,
so that all three roots are real. Bombelli noted that one root is $y = 4$ and
used Cardan’s formulas to show that

$$4 = \sqrt[3]{2 + 11i} + \sqrt[3]{2 - 11i}$$

for appropriate choices of cube roots. To understand this formula, Bombelli noted
that $(2 + i)^3 = 2 + 11i$ and $(2 - i)^3 = 2 - 11i$. Hence the cube roots in the above
formula are $2 + i$ and $2 - i$, and their sum is clearly 4.

In Exercise 7 below, you will find the other two roots of the equation and explain
how Cardan’s formulas give these two roots.
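
Bombelli's arithmetic is small enough to replay with Python's built-in complex numbers. This sketch only re-checks the numbers stated in the example; the variable names are ours.

```python
p, q = -15, -4

# Discriminant (1.22) of y^3 - 15y - 4.
delta = -4 * p ** 3 - 27 * q ** 2

# Bombelli's choice of cube roots.
z1 = complex(2, 1)    # 2 + i
z2 = complex(2, -1)   # 2 - i

cube1 = z1 * z1 * z1  # should equal 2 + 11i
cube2 = z2 * z2 * z2  # should equal 2 - 11i
product = z1 * z2     # should equal -p/3 = 5, as Cardan's formulas require
y = z1 + z2           # the real root

print(delta, cube1, cube2, product, y)
```

The small integers involved are represented exactly in floating point, so the equalities hold without rounding error.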


From the point of view of Cardan’s formulas, complex numbers are unavoidable 
when A > 0. But is it possible that there are other ways of expressing the roots which 
only involve real radicals? In Chapter 8 we will prove that when an irreducible cubic 
has real roots, the answer to this question is no—using Galois theory, we will see 
that complex numbers are in fact unavoidable when trying to express the roots of an 
irreducible cubic with positive discriminant in terms of radicals. 


B. Trigonometric Solution of the Cubic. Although complex numbers are 
unavoidable when applying Cardan’s formulas to a cubic with positive discriminant, 
there is a purely “real” solution provided we use trigonometric functions rather than 
radicals. This is the trigonometric solution of the cubic, due to Viète.

Our starting point is the trigonometric identity

$$\cos(3\theta) = 4\cos^3\theta - 3\cos\theta,$$

which you will prove in Exercise 8. If we write this as $4\cos^3\theta - 3\cos\theta - \cos(3\theta) = 0$,
then $t_1 = \cos\theta$ is a root of the cubic equation $4t^3 - 3t - \cos(3\theta) = 0$. However, replacing
$\theta$ with $\theta + \frac{2\pi}{3}$ gives the same cubic polynomial, since $\cos(3(\theta + \frac{2\pi}{3})) = \cos(3\theta)$. It
follows that $t_2 = \cos(\theta + \frac{2\pi}{3})$ is another root of $4t^3 - 3t - \cos(3\theta) = 0$, and similarly,
$t_3 = \cos(\theta + \frac{4\pi}{3})$ is also a root.

In Exercise 9 you will show that the discriminant of $4t^3 - 3t - \cos(3\theta)$ is given
by $\frac{27}{16}\sin^2(3\theta)$. This is zero if and only if $\sin(3\theta) = 0$, which in turn is equivalent to
$\cos(3\theta) = \pm 1$. Thus $\cos(3\theta) \ne \pm 1$ implies that $4t^3 - 3t - \cos(3\theta)$ has distinct roots

$$t_1 = \cos\theta, \quad t_2 = \cos\bigl(\theta + \tfrac{2\pi}{3}\bigr), \quad t_3 = \cos\bigl(\theta + \tfrac{4\pi}{3}\bigr). \tag{1.25}$$

Hence $4t^3 - 3t - \cos(3\theta) = 0$ is a cubic equation with known roots. Viète’s insight
was that by a simple change of variable, we can use this to solve any cubic equation
with positive discriminant. Here is his result.


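Theorem 1.3.3 Let $y^3 + py + q = 0$ be a cubic equation with real coefficients and
positive discriminant. Then $p < 0$, and the roots of the equation are

$$y_1 = 2\sqrt{-\frac{p}{3}}\cos\theta, \quad y_2 = 2\sqrt{-\frac{p}{3}}\cos\bigl(\theta + \tfrac{2\pi}{3}\bigr), \quad\text{and}\quad y_3 = 2\sqrt{-\frac{p}{3}}\cos\bigl(\theta + \tfrac{4\pi}{3}\bigr),$$

where $\theta$ is the real number defined by

$$\theta = \frac{1}{3}\cos^{-1}\left(\frac{3\sqrt{3}\,q}{2p\sqrt{-p}}\right).$$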


Proof: You will prove this in Exercise 10.
In Exercise 11 you will explore how this relates to Cardan’s formulas. 
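
Theorem 1.3.3 translates almost line by line into code. The following is a floating-point sketch under the theorem's hypotheses (real $p, q$ and positive discriminant); the function name `trig_roots` is an assumption, and Bombelli's cubic $y^3 - 15y - 4$ from Example 1.3.2 is used as the test case.

```python
import math

def trig_roots(p, q):
    """Roots of y^3 + p*y + q via Theorem 1.3.3 (requires positive discriminant)."""
    delta = -4 * p ** 3 - 27 * q ** 2
    if delta <= 0:
        raise ValueError("Theorem 1.3.3 needs a positive discriminant")
    # theta = (1/3) * arccos( 3*sqrt(3)*q / (2*p*sqrt(-p)) ); note p < 0 here.
    theta = math.acos(3 * math.sqrt(3) * q / (2 * p * math.sqrt(-p))) / 3
    scale = 2 * math.sqrt(-p / 3)
    return [scale * math.cos(theta + 2 * math.pi * k / 3) for k in range(3)]

roots = trig_roots(-15, -4)
for y in roots:
    # Each residual y^3 - 15y - 4 should vanish up to rounding error.
    print(y, y ** 3 - 15 * y - 4)
```

The first root returned corresponds to $y_1 = 2\sqrt{-p/3}\cos\theta$, which for this cubic is the root $y = 4$ found by Bombelli. When the discriminant is positive but extremely small, rounding could push the argument of `math.acos` slightly outside $[-1, 1]$; this sketch does not guard against that.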
Historical Notes 


When Cardan wrote Ars Magna in 1545, he and his contemporaries wanted to find 
real roots of cubic equations. In fact, they worked almost exclusively with positive 
roots, although they were aware of the existence of negative roots, which Cardan 
called “false” or “fictitious.” However, Cardan does use complex numbers in Chapter 
XXXVII when he considers the problem of dividing 10 into two parts so that their 
product is 40. In modern notation this gives the equations x + y = 10 and xy = 40. 
Eliminating y, we get the quadratic equation 


$$x^2 - 10x + 40 = 0$$

with roots $5 \pm i\sqrt{15}$. After deriving this solution, Cardan says “Putting aside the
mental tortures involved, multiply $5 + \sqrt{-15}$ by $5 - \sqrt{-15}$, making $25 - (-15)$ ...
Hence this product is 40.” Cardan’s conclusion is that “This truly is sophisticated”
[2, pp. 219-220].

Cardan was also aware of Theorem 1.3.1, though he stated it in very different 
terms. As an example of a cubic with three real roots, he considers $x^3 + 9 = 12x$, for
which he gives the “true” (i.e., positive) solutions $3$ and $\sqrt{5\tfrac{1}{4}} - 1\tfrac{1}{2}$ and the “false”
(i.e., negative) solution $-\sqrt{5\tfrac{1}{4}} - 1\tfrac{1}{2}$.

However, Cardan never applies his formulas to cubics like $x^3 + 9 = 12x$. He
only considers cases where there is one real root, which can be expressed in terms 
of real radicals. Yet Cardan must have known that complex numbers appear in the 
radicals when the discriminant is positive. This is the casus irreducibilis (“irreducible 
case”) mentioned above. According to [1], Tartaglia was also aware of the casus 
irreducibilis, and in fact delayed publication of his results because he was so troubled 
by it. This is part of the reason why Cardan’s work appeared first. 

One of the first people to comment directly on the casus irreducibilis was Rafael 
Bombelli. In his book L’algebra, written around 1550 but not published until 1572, 
he treats this case in detail, including the formula 


$$4 = \sqrt[3]{2 + 11i} + \sqrt[3]{2 - 11i} \tag{1.26}$$

from Example 1.3.2. There we saw how Bombelli explained this formula by showing
that $2 + 11i = (2 + i)^3$, so that (1.26) reduces to $4 = (2 + i) + (2 - i)$. Bombelli was
pleased with this calculation and commented that 

At first, the thing [equation (1.26)] seemed to me to be based more on sophism 

than on truth, but I searched until I found a proof.

In working out this solution, Bombelli was the first to give systematic rules for
adding and multiplying complex numbers. Exercise 12 will discuss another example 
of complex cube roots taken from Bombelli’s work. 

The moral is that cubic equations forced mathematicians to confront complex 
numbers. For quadratic equations, one could pretend that complex solutions don’t 
exist. But for a cubic with real roots, we’ve seen that Cardan’s formula must involve 
complex numbers. So it is impossible to ignore complex numbers in this case. See the 
books [1] and [3] for more background and discussion on the discovery of complex 
numbers. 

We should also say a few words about Viète’s trigonometric solution of the cubic.
Once we realize that $\cos(3\theta) = 4\cos^3\theta - 3\cos\theta$ gives a cubic equation with $\cos\theta$
as a root, proving Theorem 1.3.3 is not that difficult. Viète was well aware of
such identities. For example, in 1593, Adrianus Romanus (also called Adriaen van
Roomen) posed the problem of finding a root of the equation


$$\begin{aligned} A ={}& x^{45} - 45x^{43} + 945x^{41} - 12300x^{39} + 111150x^{37} - 740259x^{35} \\ &+ 3764565x^{33} - 14945040x^{31} + 46955700x^{29} - 117679100x^{27} \\ &+ 236030652x^{25} - 378658800x^{23} + 483841800x^{21} - 488494125x^{19} \\ &+ 384942375x^{17} - 232676280x^{15} + 105306075x^{13} - 34512075x^{11} \\ &+ 7811375x^{9} - 1138500x^{7} + 95634x^{5} - 3795x^{3} + 45x, \end{aligned} \tag{1.27}$$

where

$$A = \sqrt{\frac{7}{4} - \sqrt{\frac{5}{16}} - \sqrt{\frac{15}{8} - \sqrt{\frac{45}{64}}}}. \tag{1.28}$$

Viète solved this equation by noting that $2\sin(45\alpha)$ can be expressed as a polynomial
of degree 45 in $2\sin\alpha$ whose coefficients match the right-hand side of (1.27). It
follows that if $A = 2\sin(45\alpha)$, then $x = 2\sin\alpha$ is a root.

Viète also realized that (1.28) can be written

$$A = 2\sin(\pi/15) = 2\sin(45 \cdot \pi/675),$$

which easily implies that one root of (1.27) is $x = 2\sin(\pi/675)$. Using the trick of
(1.25), we get the 44 additional solutions

$$x = 2\sin\left(\frac{\pi}{675} + \frac{2\pi j}{45}\right), \qquad j = 1, \dots, 44.$$

Viète listed only 23 roots, since he (like Cardan) wanted positive solutions. Nevertheless,
Viète’s insight is impressive, and his solution of (1.27) makes it clear how he
was able to find the trigonometric solution of the cubic.

Exercises for Section 1.3


Exercise 1. Let $f(y) = y^3 + py + q = (y - y_1)(y - y_2)(y - y_3)$, and set

$$\Delta = (y_1 - y_2)^2(y_1 - y_3)^2(y_2 - y_3)^2.$$


The goal of this exercise is to give a different proof of (1.22). 

(a) Use the product rule to show that $f'(y_1) = (y_1 - y_2)(y_1 - y_3)$, where $f'$ denotes the
derivative of $f$. Also derive similar formulas for $f'(y_2)$ and $f'(y_3)$.

(b) Conclude that $\Delta = -f'(y_1)f'(y_2)f'(y_3)$. Be sure to explain where the minus sign comes
from.

(c) The quadratic $f'(y) = 3y^2 + p$ factors as $f'(y) = 3(y - \alpha)(y - \beta)$, where $\alpha = \sqrt{-p/3}$ and
$\beta = -\sqrt{-p/3}$ (when $p > 0$, we let $\sqrt{-p/3} = i\sqrt{p/3}$). Prove that $\Delta = -27f(\alpha)f(\beta)$.

(d) Use $f(y) = y^3 + py + q$ and $\alpha = \sqrt{-p/3}$ to show that

$$f(\alpha) = \bigl(\sqrt{-p/3}\bigr)^3 + p\sqrt{-p/3} + q = (2/3)\,p\sqrt{-p/3} + q.$$

Similarly, show that $f(\beta) = -(2/3)\,p\sqrt{-p/3} + q$.
(e) By combining parts (c) and (d), conclude that $\Delta = -4p^3 - 27q^2$.


Exercise 2. Let $f(y) = y^3 + py + q$. The purpose of Exercises 2-5 is to prove Theorem 1.3.1
geometrically using curve graphing techniques. The proof breaks up into three cases
corresponding to $p > 0$, $p = 0$, and $p < 0$. This exercise will consider the case $p > 0$.

(a) Explain why $\Delta < 0$.

(b) Analyze the sign of $f'(y)$, and show that $f(y)$ is always increasing.

(c) Explain why $f(y)$ has only one real root.


Exercise 3. Next, consider the case $p = 0$.
(a) Explain why $\Delta < 0$.
(b) Explain why $f(y)$ has only one real root.


Exercise 4. Finally, consider the case $p < 0$. In this case, $f'(y) = 3y^2 + p$ has roots $\alpha = \sqrt{-p/3}$
and $\beta = -\sqrt{-p/3}$, which are real and distinct.

(a) Show that the graph of $f(y)$ has a local minimum at $\alpha$ and a local maximum at $\beta$. Thus
$f(\alpha)$ is a local minimum value and $f(\beta)$ is a local maximum value. Also show that
$f(\alpha) < f(\beta)$.

(b) Explain why $f(y)$ has three real roots if $f(\alpha)$ and $f(\beta)$ have opposite signs and has one
real root if they have the same sign. Illustrate your answer with a drawing of the three
cases that can occur.

(c) Conclude that $f(y)$ has three real roots if and only if $f(\alpha)f(\beta) < 0$.

(d) Finally, use part (c) of Exercise 1 to show that the roots are all real if and only if $\Delta > 0$.


Exercise 5. Explain how Theorem 1.3.1 follows from Exercises 2, 3, and 4. Notice that 


the quantity $f(\alpha)f(\beta)$, which appeared earlier in part (c) of Exercise 1, arises naturally in
Exercise 4. 


Exercise 6. Prove (1.24). 


Exercise 7. Example 1.3.2 expressed the root $y = 4$ of $y^3 - 15y - 4$ in terms of Cardan’s
formulas. Find the other two roots, and explain how Cardan’s formulas give these roots. 


Exercise 8. Derive the trigonometric identity $\cos(3\theta) = 4\cos^3\theta - 3\cos\theta$ using
$\cos(x + y) = \cos x\cos y - \sin x\sin y$ and $\cos^2\theta + \sin^2\theta = 1$.

Exercise 9. When divided by 4, $4t^3 - 3t - \cos(3\theta)$ gives $t^3 - \frac{3}{4}t - \frac{1}{4}\cos(3\theta)$, which is monic.
Show that the discriminant of this polynomial is $\frac{27}{16}\sin^2(3\theta)$.

Exercise 10. The goal of this exercise is to prove Theorem 1.3.3. Let $y^3 + py + q = 0$ be a
cubic equation with positive discriminant. Consider the substitution $y = \lambda t$, which transforms
the given equation into $\lambda^3t^3 + \lambda pt + q = 0$.

(a) Show that Exercises 2 and 3 imply that p < 0. 

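(b) The equation $\lambda^3t^3 + \lambda pt + q = 0$ can be written as

$$4t^3 - \left(-\frac{4p}{\lambda^2}\right)t - \left(-\frac{4q}{\lambda^3}\right) = 0.$$

Show that this coincides with $4t^3 - 3t - \cos(3\theta) = 0$ if and only if

$$\lambda = 2\sqrt{-\frac{p}{3}} \quad\text{and}\quad \cos(3\theta) = \frac{3\sqrt{3}\,q}{2p\sqrt{-p}}.$$

Note that $\sqrt{-p}$ is real and nonzero by part (a).
(c) Use $\Delta = -(4p^3 + 27q^2) > 0$ to prove that

$$\left|\frac{3\sqrt{3}\,q}{2p\sqrt{-p}}\right| < 1.$$

(d) Explain how part (c) implies that the second equation of part (b) can be solved for $\theta$. Also
show that $\Delta > 0$ implies that $\cos(3\theta) \ne \pm 1$.

(e) By (1.25), $t_1 = \cos\theta$, $t_2 = \cos(\theta + \frac{2\pi}{3})$, and $t_3 = \cos(\theta + \frac{4\pi}{3})$ are the three roots of
$\lambda^3t^3 + \lambda pt + q = 0$. Then show that the theorem follows by transforming this back to
$y = \lambda t$ via part (b).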


Exercise 11. Consider the equation $4t^3 - 3t - \cos(3\theta) = 0$, where $\cos(3\theta) \ne \pm 1$. In (1.25),
we expressed the roots in terms of trigonometric functions. In this exercise, you will study
what happens when we use Cardan’s formulas.
(a) Show that Cardan’s formulas give the root

$$t_1 = \tfrac{1}{2}\sqrt[3]{\cos(3\theta) + i\sin(3\theta)} + \tfrac{1}{2}\sqrt[3]{\cos(3\theta) - i\sin(3\theta)}.$$

(b) Explain why $\tfrac{1}{2}e^{i\theta} = \tfrac{1}{2}(\cos\theta + i\sin\theta)$ is a value of $\tfrac{1}{2}\sqrt[3]{\cos(3\theta) + i\sin(3\theta)}$, and use this to
show that $t_1$ is just $\cos\theta$.
(c) Similarly, show that Cardan’s formulas also give the roots $t_2$ and $t_3$ as predicted by (1.25).


Exercise 12. Example 1.3.2 discusses Bombelli’s discovery that $\sqrt[3]{2 + 11i} = 2 + i$. But not all
cube roots can be expressed so simply. This exercise will show that $\sqrt[3]{4 + \sqrt{11}\,i}$ is not of the
form $a + b\sqrt{11}\,i$ for $a, b \in \mathbb{Z}$.

(a) Suppose that $4 + \sqrt{11}\,i = (a + b\sqrt{11}\,i)^3$ for some $a, b \in \mathbb{Z}$. Show that this implies that
$4 = a^3 - 33ab^2$ and $1 = 3a^2b - 11b^3$.

(b) Show that the equations of part (a) imply that $b = \pm 1$ and $a \mid 4$. Conclude that the equation
$4 + \sqrt{11}\,i = (a + b\sqrt{11}\,i)^3$ has no solutions with $a, b \in \mathbb{Z}$.

(c) Find a cubic polynomial of the form $x^3 + px + q$ with $p, q \in \mathbb{Z}$ which has the number
$\sqrt[3]{4 + \sqrt{11}\,i} + \sqrt[3]{4 - \sqrt{11}\,i}$ as a root.

In contrast to $\sqrt[3]{2 + 11i} = 2 + i$, Bombelli was not certain that $\sqrt[3]{4 + \sqrt{11}\,i}$ was a complex
number. He calls $\sqrt[3]{4 + \sqrt{11}\,i}$ “another sort of cubic radical.” Bombelli never deals with this
radical by itself, but rather considers the sum $\sqrt[3]{4 + \sqrt{11}\,i} + \sqrt[3]{4 - \sqrt{11}\,i}$, which is a root of the
cubic equation found in part (c).


Exercise 13. Suppose that a quartic polynomial $f = x^4 + bx^3 + cx^2 + dx + e$ in $\mathbb{R}[x]$ has distinct
roots $x_1, x_2, x_3, x_4 \in \mathbb{C}$. The discriminant of $f$ is defined by the equation

$$\Delta = (x_1 - x_2)^2(x_1 - x_3)^2(x_1 - x_4)^2(x_2 - x_3)^2(x_2 - x_4)^2(x_3 - x_4)^2.$$

The theory developed in Chapter 2 will imply that $\Delta \in \mathbb{R}$, and $\Delta \ne 0$, since the $x_i$ are distinct.
Adapt the proof of Theorem 1.3.1 to show that

$$\Delta < 0 \implies x^4 + bx^3 + cx^2 + dx + e = 0 \text{ has exactly two real roots.}$$


Exercise 14. In Section 1.1, we discussed the equation $x^3 + 2x^2 + 10x = 20$ considered by
Fibonacci. 


(a) Show that this equation has precisely one real root. This is the root Fibonacci approxi- 
mated so well. 


(b) Use Cardan’s formulas and a calculator to work out numerically the three roots of this 
polynomial. 


Exercise 15. Use a calculator and Theorem 1.3.3 to compute the roots of the cubic equation
$y^3 - 7y + 3 = 0$ to eight decimal places of accuracy.


CHAPTER 2


SYMMETRIC POLYNOMIALS 


The goal of this chapter is to provide some tools needed for our study of Galois theory. 
The basic result is that any polynomial unchanged under all possible permutations 
of the variables can be expressed in terms of certain special polynomials called the 
elementary symmetric polynomials. After proving this, we will show how to compute 
with symmetric polynomials and discuss the discriminant mentioned in Chapter 1. 


2.1 POLYNOMIALS OF SEVERAL VARIABLES 


Galois theory often deals with polynomials of more than one variable, especially 
when studying the roots of a polynomial. This section will introduce polynomials of 
several variables and the elementary symmetric polynomials. 


A. The Polynomial Ring in n Variables. Let $x_1, \dots, x_n$ be distinct formal
symbols called variables. A polynomial in $x_1, \dots, x_n$ with coefficients in a field $F$ is
a finite sum of terms, which are expressions of the form

$$c\,x_1^{a_1} \cdots x_n^{a_n}, \qquad c \in F, \quad a_1, \dots, a_n \ge 0 \text{ in } \mathbb{Z}.$$

We call the product $x_1^{a_1} \cdots x_n^{a_n}$ a monomial, so that a term is an element of $F$ times
a monomial. A term is nonzero if the constant is nonzero. The total degree of a
nonzero term $c\,x_1^{a_1} \cdots x_n^{a_n}$ is the sum of its exponents $a_1 + \cdots + a_n$.

We define $F[x_1, \dots, x_n]$ to be the set of all polynomials in $x_1, \dots, x_n$ with coefficients
in $F$. It is easy to see that $F[x_1, \dots, x_n]$ is a ring under addition and multiplication
of polynomials. The total degree of a nonzero $f \in F[x_1, \dots, x_n]$, denoted $\deg(f)$, is
the maximum of the total degrees of the nonzero terms of $f$. Since $F$ is an integral
domain, one can prove without difficulty that if $f, g \in F[x_1, \dots, x_n]$ are nonzero, then

$$\deg(fg) = \deg(f) + \deg(g). \tag{2.1}$$

It follows that $F[x_1, \dots, x_n]$ is an integral domain. Note that $\deg(0)$ is not defined.
Since $F[x_1, \dots, x_n]$ is an integral domain, we can define its field of fractions

$$F(x_1, \dots, x_n) = \left\{ \frac{f}{g} \;\middle|\; f, g \in F[x_1, \dots, x_n],\ g \ne 0 \right\}.$$

This is the field of rational functions in n variables. Note that:

• Square brackets, as in $F[x_1, \dots, x_n]$, refer to polynomials.
• Parentheses, as in $F(x_1, \dots, x_n)$, refer to quotients of polynomials.

A nonconstant polynomial in $F[x_1, \dots, x_n]$ is irreducible over $F$ if it is not a
product of polynomials of strictly smaller total degree. We can factor polynomials in
$F[x_1, \dots, x_n]$ into irreducibles as follows.


Theorem 2.1.1 Let $f \in F[x_1, \dots, x_n]$ be nonconstant. Then there are irreducible
polynomials $g_1, \dots, g_r \in F[x_1, \dots, x_n]$ such that

$$f = g_1 \cdots g_r.$$

Furthermore, if there is a second factorization of $f$ into irreducibles

$$f = h_1 \cdots h_s,$$

then $r = s$ and the $h_i$’s can be permuted so that each $h_i$ is a constant multiple of $g_i$.

Proof: See Corollary A.5.7 of Appendix A.


In Section A.5, we define the general notion of a unique factorization domain, or
UFD. In this terminology, Theorem 2.1.1 states that $F[x_1, \dots, x_n]$ is a UFD.


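A useful property of $F[x_1, \dots, x_n]$ is that evaluation is a ring homomorphism.
Suppose that we have a field $F$, a ring $R$ containing $F$, and elements $\alpha_1, \dots, \alpha_n \in R$.
Then the evaluation map

$$F[x_1, \dots, x_n] \longrightarrow R$$

is defined by

$$f(x_1, \dots, x_n) \longmapsto f(\alpha_1, \dots, \alpha_n) \in R. \tag{2.2}$$

We have the following important result.


Theorem 2.1.2 Given a field $F$, a ring $R$ containing $F$, and $\alpha_1, \dots, \alpha_n \in R$, the
evaluation map (2.2) is a ring homomorphism $F[x_1, \dots, x_n] \to R$.


Proof: The proof is a tedious verification that

$$\begin{aligned} (f + g)(\alpha_1, \dots, \alpha_n) &= f(\alpha_1, \dots, \alpha_n) + g(\alpha_1, \dots, \alpha_n), \\ (fg)(\alpha_1, \dots, \alpha_n) &= f(\alpha_1, \dots, \alpha_n)\,g(\alpha_1, \dots, \alpha_n), \end{aligned}$$

where $f + g$ and $fg$ are the sum and product of polynomials $f$ and $g$.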


Once we fix the field $F$, the variables $x_1, \dots, x_n$ play two roles. At the beginning
of the section, they were formal symbols used in the definition of polynomial. But
each variable $x_i$ also has the ability to “take any value.” In other words, $x_1, \dots, x_n$
can take arbitrary values in any ring R containing F. Be sure you understand how 
Theorem 2.1.2 makes this precise. 
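
To make the two roles of the variables concrete, here is a small Python sketch in which a polynomial in $n$ variables is stored as a dictionary from exponent tuples to coefficients (the "formal symbol" role) and `evaluate` substitutes values (the "take any value" role). The representation and the function names are our own assumptions, chosen only to illustrate (2.1) and Theorem 2.1.2 on one example.

```python
from fractions import Fraction

# A polynomial is a dict {exponent tuple: nonzero coefficient},
# e.g. {(2, 1): 1, (0, 0): 3} stands for x^2*y + 3.

def poly_add(f, g):
    """Sum of two polynomials, dropping terms whose coefficient becomes zero."""
    h = dict(f)
    for mono, c in g.items():
        h[mono] = h.get(mono, 0) + c
    return {m: c for m, c in h.items() if c != 0}

def poly_mul(f, g):
    """Product of two polynomials: multiply terms pairwise and add exponents."""
    h = {}
    for m1, c1 in f.items():
        for m2, c2 in g.items():
            mono = tuple(a + b for a, b in zip(m1, m2))
            h[mono] = h.get(mono, 0) + c1 * c2
    return {m: c for m, c in h.items() if c != 0}

def total_degree(f):
    """Total degree of a nonzero polynomial."""
    return max(sum(mono) for mono in f)

def evaluate(f, point):
    """The evaluation map (2.2): substitute point[i] for the i-th variable."""
    total = 0
    for mono, c in f.items():
        term = c
        for value, exponent in zip(point, mono):
            term *= value ** exponent
        total += term
    return total

f = {(2, 1): 1, (0, 0): 3}      # x^2*y + 3
g = {(1, 0): 2, (0, 2): -1}     # 2x - y^2
point = (Fraction(1, 2), Fraction(-3))

print(evaluate(poly_add(f, g), point) == evaluate(f, point) + evaluate(g, point))  # True
print(evaluate(poly_mul(f, g), point) == evaluate(f, point) * evaluate(g, point))  # True
print(total_degree(poly_mul(f, g)) == total_degree(f) + total_degree(g))           # True
```

This checks the homomorphism property at a single point only; it is an illustration, not a proof.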


B. The Elementary Symmetric Polynomials. How do the roots of a monic 
polynomial in x relate to its coefficients? To answer this question, we begin with 
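cubic and quartic polynomials. Suppose that $f = x^3 + a_1x^2 + a_2x + a_3 \in F[x]$ has
roots $\alpha_1, \alpha_2, \alpha_3 \in F$. Then

$$f = (x - \alpha_1)(x - \alpha_2)(x - \alpha_3).$$

If we multiply this out and compare coefficients, then the coefficients can be expressed
in terms of the roots as

$$\begin{aligned} a_1 &= -(\alpha_1 + \alpha_2 + \alpha_3), \\ a_2 &= \alpha_1\alpha_2 + \alpha_1\alpha_3 + \alpha_2\alpha_3, \\ a_3 &= -\alpha_1\alpha_2\alpha_3. \end{aligned} \tag{2.3}$$

(See also (1.21) in Section 1.2.) For $n = 4$, a similar computation shows that if
$f = x^4 + a_1x^3 + a_2x^2 + a_3x + a_4 \in F[x]$ has roots $\alpha_1, \alpha_2, \alpha_3, \alpha_4 \in F$, then

$$\begin{aligned} a_1 &= -(\alpha_1 + \alpha_2 + \alpha_3 + \alpha_4), \\ a_2 &= \alpha_1\alpha_2 + \alpha_1\alpha_3 + \alpha_1\alpha_4 + \alpha_2\alpha_3 + \alpha_2\alpha_4 + \alpha_3\alpha_4, \\ a_3 &= -(\alpha_1\alpha_2\alpha_3 + \alpha_1\alpha_2\alpha_4 + \alpha_1\alpha_3\alpha_4 + \alpha_2\alpha_3\alpha_4), \\ a_4 &= \alpha_1\alpha_2\alpha_3\alpha_4. \end{aligned}$$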


Up to sign, $a_1$ uses the sum of the roots, $a_2$ takes the roots two at a time, and $a_3$ takes
them three at a time. We generalize this pattern as follows.

Definition 2.1.3 Let $x_1, \dots, x_n$ be variables over a field $F$. Then

$$\begin{aligned} \sigma_1 &= x_1 + \cdots + x_n, \\ \sigma_2 &= \sum_{1 \le i < j \le n} x_ix_j, \\ &\;\;\vdots \\ \sigma_r &= \sum_{1 \le i_1 < \cdots < i_r \le n} x_{i_1}x_{i_2} \cdots x_{i_r}, \\ &\;\;\vdots \\ \sigma_n &= x_1x_2 \cdots x_n \end{aligned}$$

are the elementary symmetric polynomials. Thus $\sigma_1, \dots, \sigma_n \in F[x_1, \dots, x_n]$.


We will sometimes write $\sigma_r = \sigma_r(x_1, \dots, x_n)$. The following identity is one of the
key properties of the elementary symmetric polynomials.


Proposition 2.1.4 Let $x_1, \dots, x_n$ be variables over a field $F$. Then, given another
variable $x$, we have

$$(x - x_1) \cdots (x - x_n) = x^n - \sigma_1x^{n-1} + \cdots + (-1)^r\sigma_rx^{n-r} + \cdots + (-1)^n\sigma_n. \tag{2.4}$$


Proof: The proof follows by multiplying out the left-hand side of (2.4) and then
computing the coefficient of each power of $x$. For example, the constant term is
obviously the product of constant terms, namely $(-x_1) \cdots (-x_n) = (-1)^n\sigma_n$. Similarly,
the coefficient of $x^{n-1}$ is easily seen to be $-x_1 - \cdots - x_n = -\sigma_1$.

For readers interested in the details of how this works in general, observe that we
multiply out $(x - x_1) \cdots (x - x_n)$ as follows:
• For each of the $n$ factors $x - x_i$, choose either $x$ or $-x_i$.
• Take the product of these $n$ choices.
• Sum these products over all possible ways of making the $n$ choices.


It follows that the terms involving $x^{n-r}$ in $(x - x_1) \cdots (x - x_n)$ are those products
where we chose $x$ exactly $n - r$ times in the first bullet. This means choosing $-x_i$
for the $i_1$st, $i_2$nd, ..., $i_r$th factors and choosing $x$ for the remaining $n - r$ factors. As
described in the second bullet, the product of these choices is

$$(-x_{i_1})(-x_{i_2}) \cdots (-x_{i_r})\,x^{n-r} = (-1)^r x_{i_1} \cdots x_{i_r}\,x^{n-r}.$$


When we sum over all possible ways of making the $n$ choices (as described in the
third bullet), it follows that the coefficient of $x^{n-r}$ in the left-hand side of (2.4) is

$$(-1)^r \sum_{1 \le i_1 < \cdots < i_r \le n} x_{i_1} \cdots x_{i_r} = (-1)^r\sigma_r.$$

This completes the proof of the proposition.

Proposition 2.1.4 has the following useful application. Suppose that a monic
polynomial $f = x^n + a_1x^{n-1} + a_2x^{n-2} + \cdots + a_{n-1}x + a_n \in F[x]$ has roots $\alpha_1, \dots, \alpha_n$
in a larger field $L$. This means that

$$x^n + a_1x^{n-1} + \cdots + a_{n-1}x + a_n = (x - \alpha_1) \cdots (x - \alpha_n).$$

However, since evaluation is a ring homomorphism (Theorem 2.1.2), we can evaluate
the identity (2.4) at $x_1 = \alpha_1, \dots, x_n = \alpha_n$ to obtain

$$\begin{aligned} (x - \alpha_1) \cdots (x - \alpha_n) = {}& x^n - \sigma_1(\alpha_1, \dots, \alpha_n)\,x^{n-1} + \cdots \\ &+ (-1)^{n-1}\sigma_{n-1}(\alpha_1, \dots, \alpha_n)\,x + (-1)^n\sigma_n(\alpha_1, \dots, \alpha_n). \end{aligned}$$

These two formulas give the following corollary of Proposition 2.1.4.

Corollary 2.1.5 Let $f = x^n + a_1x^{n-1} + a_2x^{n-2} + \cdots + a_{n-1}x + a_n$ be a monic
polynomial of degree $n > 0$ with coefficients in a field $F$. If $f$ has roots $\alpha_1, \dots, \alpha_n$ in a
larger field $L$, then the coefficients of $f$ are expressed in terms of its roots as

$$a_r = (-1)^r\sigma_r(\alpha_1, \dots, \alpha_n)$$

for $r = 1, \dots, n$.
Here is what happens when n = 3. 


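Example 2.1.6 If $x^3 + a_1x^2 + a_2x + a_3$ has roots $\alpha_1, \alpha_2, \alpha_3$, then Corollary 2.1.5
implies that

$$\begin{aligned} a_1 &= -\sigma_1(\alpha_1, \alpha_2, \alpha_3) = -(\alpha_1 + \alpha_2 + \alpha_3), \\ a_2 &= \sigma_2(\alpha_1, \alpha_2, \alpha_3) = \alpha_1\alpha_2 + \alpha_1\alpha_3 + \alpha_2\alpha_3, \\ a_3 &= -\sigma_3(\alpha_1, \alpha_2, \alpha_3) = -\alpha_1\alpha_2\alpha_3, \end{aligned}$$

in agreement with (2.3).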
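
Corollary 2.1.5 can also be checked numerically for any list of roots. In the sketch below, `elementary_symmetric` follows Definition 2.1.3 literally (a sum over all $r$-element subsets), while `expand_product` multiplies out $(x - \alpha_1) \cdots (x - \alpha_n)$ one factor at a time. Both function names and the sample roots are assumptions made for illustration.

```python
from itertools import combinations
from math import prod

def elementary_symmetric(values, r):
    """sigma_r evaluated at the given values (sigma_0 is 1 by convention)."""
    return sum(prod(subset) for subset in combinations(values, r))

def expand_product(values):
    """Coefficients of (x - v_1)...(x - v_n), highest degree first."""
    coeffs = [1]
    for v in values:
        # Multiply the current polynomial by (x - v).
        new = coeffs + [0]
        for i, c in enumerate(coeffs):
            new[i + 1] -= v * c
        coeffs = new
    return coeffs

roots = [1, 2, 3, 4]
coeffs = expand_product(roots)
predicted = [(-1) ** r * elementary_symmetric(roots, r) for r in range(len(roots) + 1)]
print(coeffs)      # [1, -10, 35, -50, 24]
print(predicted)   # [1, -10, 35, -50, 24]
```

Here `coeffs[r]` is the coefficient $a_r$ of $x^{n-r}$, so the agreement of the two lists is exactly the statement $a_r = (-1)^r\sigma_r(\alpha_1, \dots, \alpha_n)$.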


Mathematical Notes 
There are two topics for us to discuss. 


• Ideals in a Polynomial Ring. The text makes it seem that $F[x_1, \dots, x_n]$ behaves
like the one-variable case studied in Section A.1. However, once we start talking
about ideals, some significant differences emerge. For example, Theorem A.1.17
implies that $F[x]$ is a PID. But as soon as the number of variables is two or more, not
all ideals are principal. Exercise 1 will give a simple example.

In fact, $F[x_1, \dots, x_n]$ has a rich supply of ideals when $n \ge 2$. These are related
to solutions of simultaneous sets of polynomial equations, which is the subject of 
algebraic geometry. See [2] for an introduction to this area of mathematics. 
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= Coefficients as Polynomials. There are other ways to think about polynomials in 
several variables. For example, we can regard f € F[x,,...,x,| as a polynomial in x, 
with coefficients in F [x1,...,%n—1], Le., 


m 


f= do pilei,---s%n—1) ahs pi € Flxy,...,X%n—1]- 
i=0 
This is expressed more formally as F[x),...,xn] = F[x1,..-,%n—1|[%n]. For instance, 


(2.4) takes place in F[x),...,Xn|[x]. See Exercise 2 for more examples. 

Exercises for Section 2.1 

Exercise 1. Show that $\langle x,y\rangle = \{xg + yh \mid g,h \in F[x,y]\} \subset F[x,y]$ is not a principal ideal in
$F[x,y]$.


Exercise 2. Express each of the following polynomials as a polynomial in y with coefficients 
that are polynomials in the remaining variables. 


(a) xy + 3y? —xy? + 3x42y" + Try’. 
(b) $(y-(x_1+x_2))\,(y-(x_1+x_3))\,(y-(x_2+x_3))$.


Exercise 3. Given positive integers n and r with $1 \le r \le n$, let $\binom{n}{r}$ be the number of ways of
choosing r elements from a set with n elements. Recall that $\binom{n}{r} = \frac{n!}{r!\,(n-r)!}$.

(a) Show that the polynomial $\sigma_r$ is a sum of $\binom{n}{r}$ terms.
(b) Show that $\sigma_r(-a,\dots,-a) = (-1)^r\binom{n}{r}a^r$.
(c) Let $f = (x+a)^n$. Use part (b) and Corollary 2.1.5 to prove that

$$(x+a)^n = \sum_{r=0}^{n}\binom{n}{r}a^r x^{n-r},$$

where $\binom{n}{0} = 1$. This shows that the binomial theorem follows from Corollary 2.1.5.


2.2 SYMMETRIC POLYNOMIALS 


We will consider polynomials in n variables $x_1,\dots,x_n$ over a field F.

Definition 2.2.1 A polynomial $f \in F[x_1,\dots,x_n]$ is symmetric if

$$f(x_{\sigma(1)},\dots,x_{\sigma(n)}) = f(x_1,\dots,x_n)$$

for all permutations $\sigma$ in the symmetric group $S_n$.

A. The Fundamental Theorem. In Section 2.1, we defined the elementary
symmetric polynomials $\sigma_1,\dots,\sigma_n$. To prove that these are symmetric in the above
sense, consider the identity

$$(x-x_1)\cdots(x-x_n) = x^n - \sigma_1x^{n-1} + \cdots + (-1)^r\sigma_rx^{n-r} + \cdots + (-1)^n\sigma_n$$

from Proposition 2.1.4. The product on the left-hand side is symmetric because
permuting the $x_i$ simply permutes the factors. Comparing this with the right-hand
side, it follows that $\sigma_1,\dots,\sigma_n$ are symmetric.

Since $\sigma_1,\dots,\sigma_n$ are symmetric, any polynomial in $\sigma_1,\dots,\sigma_n$ is also symmetric.
The remarkable fact is that all symmetric polynomials arise in this way. More
precisely, we have the following Fundamental Theorem of Symmetric Polynomials.

Theorem 2.2.2 Any symmetric polynomial in $F[x_1,\dots,x_n]$ can be written as a
polynomial in $\sigma_1,\dots,\sigma_n$ with coefficients in F.


Proof: We will follow (with a few changes) the argument given by Gauss in 1816 
in his second proof of the Fundamental Theorem of Algebra. The proof will involve 
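an inductive process which requires that we order monomials $x_1^{a_1}\cdots x_n^{a_n}$ in $x_1,\dots,x_n$.
We will use graded lexicographic order, which is defined by

$$\begin{aligned}
x_1^{a_1}\cdots x_n^{a_n} < x_1^{b_1}\cdots x_n^{b_n} \iff{}
& a_1+\cdots+a_n < b_1+\cdots+b_n,\\
& \text{or } a_1+\cdots+a_n = b_1+\cdots+b_n \text{ and } a_1 < b_1,\\
& \text{or } a_1+\cdots+a_n = b_1+\cdots+b_n \text{ and } a_1 = b_1 \text{ and } a_2 < b_2,\\
& \text{or } \dots.
\end{aligned} \tag{2.5}$$

We also define $x_1^{b_1}\cdots x_n^{b_n} > x_1^{a_1}\cdots x_n^{a_n}$ to mean $x_1^{a_1}\cdots x_n^{a_n} < x_1^{b_1}\cdots x_n^{b_n}$.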


To compare one monomial with another, one first computes the total degree of each 
monomial, and when these are equal, one checks the two monomials one exponent 
at a time, starting with $x_1$, to find the first which differs. For example,


xfxdx3 < xtxdx} (smaller total degree), 
xtxdx3 > x}x2x3 (same total degree, equal x, exponent, 


greater x2 exponent). 
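
The comparison rule (2.5) can be phrased as a sort key. The following is a minimal Python sketch, assuming monomials are stored as exponent tuples $(a_1,\dots,a_n)$; the names `grlex_key` and `grlex_less` and the sample exponent vectors are illustrative choices made here, not taken from the text.

```python
def grlex_key(exponents):
    """Sort key for graded lexicographic order: total degree, then exponents left to right."""
    return (sum(exponents), tuple(exponents))


def grlex_less(a, b):
    """True when the monomial with exponent vector a is smaller than the one with b."""
    return grlex_key(a) < grlex_key(b)


# Illustrative exponent vectors (a_1, a_2, a_3), chosen for this sketch.
smaller_degree = grlex_less((1, 2, 1), (2, 2, 1))   # total degree 4 versus 5
same_degree = grlex_less((2, 2, 2), (2, 3, 1))      # tie on degree and x_1, then 2 < 3 for x_2

# The greatest of the monomials x1*x2, x1*x3, x2*x3 under this order.
candidates = [(1, 1, 0), (1, 0, 1), (0, 1, 1)]
greatest = max(candidates, key=grlex_key)
print(smaller_degree, same_degree, greatest)
```

Python compares tuples entry by entry from the left, which is exactly the "first exponent that differs" rule once the total degrees agree.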


An important property of graded lexicographic order is that there are at most
finitely many monomials $x_1^{b_1}\cdots x_n^{b_n}$ such that

$$x_1^{b_1}\cdots x_n^{b_n} < x_1^{a_1}\cdots x_n^{a_n} \quad \text{for fixed } a_1,\dots,a_n. \tag{2.6}$$

This follows because (2.6) and (2.5) imply that $a_1+\cdots+a_n \ge b_1+\cdots+b_n$ (be sure
you understand this). Since $N = a_1+\cdots+a_n$ is fixed and $b_i \ge 0$ for all i, we get the
inequality

$$N = a_1+\cdots+a_n \ge b_1+\cdots+b_n \ge b_i$$

for all i. Hence there are only N + 1 possibilities for each $b_i$, which easily implies
that (2.6) can hold for at most finitely many $x_1^{b_1}\cdots x_n^{b_n}$.

We can apply graded lexicographic order to a nonzero polynomial as follows. We
saw in Section 2.1 that such a polynomial is a sum of nonzero terms, each of which is a
nonzero element of F times a monomial. Then the leading term is the greatest of these
monomials, relative to (2.5), times its coefficient. Thus any nonzero polynomial
has a leading term.

For example, the leading term of $\sigma_2 = x_1x_2 + x_1x_3 + \cdots + x_{n-1}x_n$ is
$x_1x_2$. In other words, $x_1x_2 > x_ix_j$ when $i < j$ and $(i,j) \ne (1,2)$. This follows by
checking the exponent of $x_1$ (if $i > 1$) or $x_2$ (if $i = 1$ and $j > 2$) in $x_ix_j$. You will
generalize this in Exercise 1 by showing that $x_1x_2\cdots x_r$ is the leading term of the rth
elementary symmetric polynomial

$$\sigma_r = \sum_{1 \le i_1 < \cdots < i_r \le n} x_{i_1}x_{i_2}\cdots x_{i_r}.$$


We are now ready to prove the theorem. Let $f \in F[x_1,\dots,x_n]$ be symmetric and
nonzero with leading term

$$c\,x_1^{a_1}\cdots x_n^{a_n}. \tag{2.7}$$

We claim that

$$a_1 \ge a_2 \ge \cdots \ge a_n. \tag{2.8}$$

To show this, suppose that $a_i < a_{i+1}$ for some $1 \le i \le n-1$. The symmetry of f
implies that interchanging $x_i$ and $x_{i+1}$ gives the same polynomial. Since (2.7) is a
term of f, it follows that

$$c\,x_1^{a_1}\cdots x_i^{a_{i+1}}x_{i+1}^{a_i}\cdots x_n^{a_n} \tag{2.9}$$

is also a term of f. To compare this with (2.7), note that both monomials have the
same total degree and the same exponents of $x_1,\dots,x_{i-1}$. However, $x_i$ has exponent
$a_{i+1}$ in (2.9) and exponent $a_i$ in (2.7). Then $a_{i+1} > a_i$ implies that (2.9) is a term of f
greater than (2.7) according to the order relation (2.5). Yet (2.7) is the leading term
of f. This contradiction proves (2.8).
Now consider

$$g = \sigma_1^{a_1-a_2}\sigma_2^{a_2-a_3}\cdots\sigma_{n-1}^{a_{n-1}-a_n}\sigma_n^{a_n}. \tag{2.10}$$

This is a polynomial by (2.8). In Exercise 2, you will prove that the leading term of
a product is the product of the leading terms. Since the leading term of $\sigma_r$ is $x_1\cdots x_r$,
it follows that the leading term of g is

$$\begin{aligned}
x_1^{a_1-a_2}(x_1x_2)^{a_2-a_3}\cdots(x_1\cdots x_n)^{a_n}
&= x_1^{a_1-a_2+a_2-a_3+\cdots+a_n}\,x_2^{a_2-a_3+\cdots+a_n}\cdots x_n^{a_n}\\
&= x_1^{a_1}x_2^{a_2}\cdots x_n^{a_n}.
\end{aligned} \tag{2.11}$$

This shows that f and cg have the same leading term. Hence $f_1 = f - cg$ has a
strictly smaller leading term according to the ordering defined in (2.5). Note that $f_1$
is symmetric, since f and g are.

Now repeat this process, starting with $f_1$ instead of f. Since $f_1$ is symmetric, it
has a leading term with coefficient $c_1$ and exponents $b_1 \ge \cdots \ge b_n$. As above, this
will give an expression $g_1$ in the elementary symmetric polynomials such that $f_1$ and
$c_1g_1$ have the same leading term. It follows that

$$f_2 = f_1 - c_1g_1 = f - cg - c_1g_1$$

has a strictly smaller leading term. Continuing in this way, we get polynomials

$$f,\quad f_1 = f - cg,\quad f_2 = f - cg - c_1g_1,\quad f_3 = f - cg - c_1g_1 - c_2g_2,\quad \dots,$$

where at each stage the leading term gets strictly smaller according to the order
defined in (2.5). This process will terminate if we find some m with $f_m = 0$, for the
zero polynomial has no leading term. If, on the other hand, we never had $f_m = 0$,
then the above would give an infinite sequence of nonzero polynomials with strictly
decreasing leading terms. But we showed above that there are only finitely many
monomials strictly smaller than the leading term of f. Hence the above process must
terminate.

However, once we have $f_m = 0$ for some m, we obtain

$$f = cg + c_1g_1 + \cdots + c_{m-1}g_{m-1}$$

since $f_m = f - cg - c_1g_1 - \cdots - c_{m-1}g_{m-1}$. Each $g_i$ is a product of the $\sigma_j$ to various
powers, which proves that f is a polynomial in the elementary symmetric polynomials.
This completes the proof. ∎


In Theorem 2.2.7 below, we will prove that the expression of f as a polynomial in
$\sigma_1,\dots,\sigma_n$ is unique.

The proof of Theorem 2.2.2 can be turned into an algorithm for writing a given
symmetric polynomial in terms of the $\sigma_i$. For this purpose, we will use the notation

$$\sum_n x_1^{a_1}\cdots x_n^{a_n}$$

to denote the sum of all distinct monomials obtained from $x_1^{a_1}\cdots x_n^{a_n}$ by permuting
$x_1,\dots,x_n$. Here are some simple examples.


Example 2.2.3 One easily sees that

$$\sum_2 x_1^2x_2 = x_1^2x_2 + x_2^2x_1$$

and

$$\sum_3 x_1^2x_2 = x_1^2x_2 + x_2^2x_1 + x_1^2x_3 + x_3^2x_1 + x_2^2x_3 + x_3^2x_2.$$

Also, $\sum_4 x_1^2x_2$ has 12 terms instead of 24. This is because $x_1^2x_2 = x_1^2x_2x_3^0x_4^0$ for n = 4.
Switching the last two variables gives the same monomial, yet $\sum_4 x_1^2x_2$ uses only the
distinct monomials we get by permuting the variables. ◇
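
The term counts in this example can be reproduced by listing the distinct permutations of an exponent vector. This is a small Python sketch under the assumption that a monomial is stored as its tuple of exponents; the helper name `orbit_monomials` is introduced here for illustration.

```python
from itertools import permutations


def orbit_monomials(exponents):
    """Distinct exponent vectors obtained from `exponents` by permuting the variables."""
    return set(permutations(exponents))


two_vars = orbit_monomials((2, 1))          # x1^2 x2 with n = 2
three_vars = orbit_monomials((2, 1, 0))     # x1^2 x2 with n = 3
four_vars = orbit_monomials((2, 1, 0, 0))   # x1^2 x2 with n = 4
print(len(two_vars), len(three_vars), len(four_vars))
```

For n = 4 the 24 permutations collapse to 12 distinct exponent vectors because swapping the two zero exponents changes nothing.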




If $c\,x_1^{a_1}\cdots x_n^{a_n}$ is a term of a symmetric polynomial $f \in F[x_1,\dots,x_n]$, then

$$f = c\sum_n x_1^{a_1}\cdots x_n^{a_n} + \text{a sum of terms involving monomials different from those in } \sum_n x_1^{a_1}\cdots x_n^{a_n}. \tag{2.12}$$

Do you see how we used this fact in the proof of Theorem 2.2.2?
Here is an example of how to write a symmetric polynomial in terms of the $\sigma_i$.


Example 2.2.4 The polynomial in $x_1,x_2,x_3,x_4$ given by

$$f = \sum_4 x_1^3x_2^2x_3$$

has 24 terms and is symmetric. In the first chapter of his 1782 book Meditationes
Algebraicæ [6], Edward Waring shows how to express f in terms of $\sigma_1,\sigma_2,\sigma_3,\sigma_4$.
His method is similar to what we did in the proof of Theorem 2.2.2. In this case, we
proceed as follows (you will supply the details in Exercise 3):

Step 1. The leading term of f is $x_1^3x_2^2x_3 = x_1^3x_2^2x_3^1x_4^0$, so that (2.10) becomes

$$\sigma_1^{3-2}\sigma_2^{2-1}\sigma_3^{1-0}\sigma_4^{0} = \sigma_1\sigma_2\sigma_3.$$

Furthermore, one can use a computer to show that

$$\sigma_1\sigma_2\sigma_3 = \sum_4 x_1^3x_2^2x_3 + 3\sum_4 x_1^3x_2x_3x_4 + 3\sum_4 x_1^2x_2^2x_3^2 + 8\sum_4 x_1^2x_2^2x_3x_4. \tag{2.13}$$

Hence

$$f_1 = f - \sigma_1\sigma_2\sigma_3 = -3\sum_4 x_1^3x_2x_3x_4 - 3\sum_4 x_1^2x_2^2x_3^2 - 8\sum_4 x_1^2x_2^2x_3x_4.$$

Step 2. The leading term of $f_1$ is $-3x_1^3x_2x_3x_4$, which gives

$$\sigma_1^{3-1}\sigma_2^{1-1}\sigma_3^{1-1}\sigma_4^{1} = \sigma_1^2\sigma_4 = \sum_4 x_1^3x_2x_3x_4 + 2\sum_4 x_1^2x_2^2x_3x_4. \tag{2.14}$$

Thus

$$f_2 = f - \sigma_1\sigma_2\sigma_3 + 3\sigma_1^2\sigma_4 = -3\sum_4 x_1^2x_2^2x_3^2 - 2\sum_4 x_1^2x_2^2x_3x_4.$$

Step 3. For $f_2$, we have $-3x_1^2x_2^2x_3^2$ as leading term. Since

$$\sigma_3^2 = \sum_4 x_1^2x_2^2x_3^2 + 2\sum_4 x_1^2x_2^2x_3x_4, \tag{2.15}$$

we obtain

$$f_3 = f - \sigma_1\sigma_2\sigma_3 + 3\sigma_1^2\sigma_4 + 3\sigma_3^2 = 4\sum_4 x_1^2x_2^2x_3x_4.$$

Step 4. The leading term of $f_3$ is $4x_1^2x_2^2x_3x_4$, and from

$$\sigma_2\sigma_4 = \sum_4 x_1^2x_2^2x_3x_4, \tag{2.16}$$

we see that

$$f_4 = f - \sigma_1\sigma_2\sigma_3 + 3\sigma_1^2\sigma_4 + 3\sigma_3^2 - 4\sigma_2\sigma_4 = 0.$$

Conclusion. Since $f_4 = 0$, the process terminates and we obtain the formula

$$f = \sigma_1\sigma_2\sigma_3 - 3\sigma_1^2\sigma_4 - 3\sigma_3^2 + 4\sigma_2\sigma_4$$

expressing f in terms of the elementary symmetric polynomials. ◇
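
The steps of this example follow the procedure from the proof of Theorem 2.2.2, and that procedure can be mechanized. The following is a minimal Python sketch, not the Mathematica or Maple approach of Section 2.3. It assumes a polynomial is stored as a dictionary from exponent tuples to coefficients; all function names are hypothetical and chosen for this illustration.

```python
from itertools import combinations, permutations


def elementary(n, r):
    """sigma_r in n variables as {exponent tuple: coefficient}."""
    return {tuple(1 if i in chosen else 0 for i in range(n)): 1
            for chosen in combinations(range(n), r)}


def multiply(p, q):
    """Product of two polynomials stored as {exponent tuple: coefficient}."""
    out = {}
    for ea, ca in p.items():
        for eb, cb in q.items():
            e = tuple(x + y for x, y in zip(ea, eb))
            out[e] = out.get(e, 0) + ca * cb
    return {e: c for e, c in out.items() if c != 0}


def symmetric_reduce(f, n):
    """Write a symmetric f as a sum of c * sigma_1^d_1 ... sigma_n^d_n.

    Returns {(d_1, ..., d_n): c}. Raises ValueError if (2.8) fails,
    which happens when f is not symmetric.
    """
    f = dict(f)
    answer = {}
    while f:
        # Leading term under graded lexicographic order (2.5).
        a = max(f, key=lambda e: (sum(e), e))
        c = f[a]
        # Exponents of (2.10): d_i = a_i - a_{i+1}, with a_{n+1} = 0.
        d = tuple(a[i] - (a[i + 1] if i + 1 < n else 0) for i in range(n))
        if min(d) < 0:
            raise ValueError("input is not symmetric")
        g = {(0,) * n: 1}
        for r, k in enumerate(d, start=1):
            for _ in range(k):
                g = multiply(g, elementary(n, r))
        answer[d] = answer.get(d, 0) + c
        # Replace f by f - c*g, dropping terms that cancel.
        for e, cg in g.items():
            f[e] = f.get(e, 0) - c * cg
            if f[e] == 0:
                del f[e]
    return answer


# f = sum over the 24 distinct permutations of x1^3 x2^2 x3 (n = 4).
f = {e: 1 for e in set(permutations((3, 2, 1, 0)))}
result = symmetric_reduce(f, 4)
print(result)
```

Each key of `result` lists the exponents of $\sigma_1,\dots,\sigma_4$, so the entry `(2, 0, 0, 1): -3` stands for $-3\sigma_1^2\sigma_4$. Exact integer arithmetic is used here; over a general field the coefficients would need a suitable number type.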


In the exercises you will apply these methods to a variety of problems dealing with 
symmetric polynomials. For readers interested in doing more substantial problems, 
Section 2.3 will explain how to compute with symmetric polynomials using Maple 
and Mathematica. 


B. The Roots of a Polynomial. In Galois theory, symmetric polynomials are
often evaluated at the roots $\alpha_1,\dots,\alpha_n$ of a polynomial $f \in F[x]$. The following result
will be crucial.

Corollary 2.2.5 Let $f \in F[x]$ be a monic polynomial of degree n > 0 with roots
$\alpha_1,\dots,\alpha_n$ in a larger field L. Then, given any symmetric polynomial $p(x_1,\dots,x_n)$
with coefficients in F, we have

$$p(\alpha_1,\dots,\alpha_n) \in F.$$

Proof: The evaluation map $F[x_1,\dots,x_n] \to L$ defined by $p \mapsto p(\alpha_1,\dots,\alpha_n)$ is a ring
homomorphism by Theorem 2.1.2.

Since p is symmetric in $x_1,\dots,x_n$, Theorem 2.2.2 implies that p is a polynomial
in the $\sigma_r$ with coefficients in F. Hence, when we evaluate at $\alpha_1,\dots,\alpha_n$, we see that
$p(\alpha_1,\dots,\alpha_n)$ is a polynomial in the $\sigma_r(\alpha_1,\dots,\alpha_n)$ with coefficients in F.

Corollary 2.1.5 tells us that $\sigma_r(\alpha_1,\dots,\alpha_n)$ is, up to sign, a coefficient of f. Since
$f \in F[x]$ by hypothesis, we conclude that $\sigma_r(\alpha_1,\dots,\alpha_n) \in F$. The corollary now
follows immediately from the previous paragraph. ∎


Here is an example of how Corollary 2.2.5 works. 


Example 2.2.6 Suppose that $f = x^3 + 2x^2 + x + 7 \in \mathbb{Q}[x]$ has roots $\alpha_1,\alpha_2,\alpha_3 \in \mathbb{C}$.
Let g be the monic polynomial whose roots are $\alpha_1+\alpha_2$, $\alpha_1+\alpha_3$, and $\alpha_2+\alpha_3$. We
claim that g has coefficients in $\mathbb{Q}$. To prove this, note that g can be written

$$\begin{aligned}
g(x) &= (x-(\alpha_1+\alpha_2))\,(x-(\alpha_1+\alpha_3))\,(x-(\alpha_2+\alpha_3))\\
&= x^3 - (2\alpha_1+2\alpha_2+2\alpha_3)\,x^2\\
&\quad + (\alpha_1^2+\alpha_2^2+\alpha_3^2+3\alpha_1\alpha_2+3\alpha_1\alpha_3+3\alpha_2\alpha_3)\,x\\
&\quad - (\alpha_1+\alpha_2)(\alpha_1+\alpha_3)(\alpha_2+\alpha_3).
\end{aligned} \tag{2.17}$$

The coefficients of (2.17) are symmetric polynomials evaluated at $\alpha_1,\alpha_2,\alpha_3$. Since
the $\alpha_i$ are the roots of a polynomial with coefficients in $\mathbb{Q}$, Corollary 2.2.5 implies
that the coefficients of g are in $\mathbb{Q}$. Hence $g \in \mathbb{Q}[x]$.

We can also determine g explicitly. In general, if $f = x^3 + bx^2 + cx + d$ has roots
$\alpha_1,\alpha_2,\alpha_3$, and g is the polynomial with roots $\alpha_1+\alpha_2$, $\alpha_1+\alpha_3$, and $\alpha_2+\alpha_3$, as
defined in (2.17), then the techniques of this section imply that

$$g(x) = x^3 + 2bx^2 + (b^2+c)\,x + bc - d$$

(see Exercise 4 for the details). For $f = x^3 + 2x^2 + x + 7$, it follows that

$$g(x) = x^3 + 2\cdot 2\,x^2 + (2^2+1)\,x + 2\cdot 1 - 7 = x^3 + 4x^2 + 5x - 5$$

is the polynomial whose roots are the sums of distinct pairs of roots of f. ◇
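
The general formula for g can be sanity-checked on a cubic whose roots are known. In the Python sketch below, the function names and the test cubic with roots 1, 2, 3 are assumptions made for illustration; only the formula $g(x) = x^3 + 2bx^2 + (b^2+c)x + bc - d$ comes from the example.

```python
def pair_sum_cubic(b, c, d):
    """Coefficients [1, 2b, b^2 + c, bc - d] of g, highest degree first."""
    return [1, 2 * b, b * b + c, b * c - d]


def monic_from_roots(roots):
    """Coefficients of (x - r_1)...(x - r_n), highest degree first."""
    coeffs = [1]
    for root in roots:
        shifted = coeffs + [0]                      # current polynomial times x
        scaled = [0] + [root * c for c in coeffs]   # current polynomial times root
        coeffs = [s - t for s, t in zip(shifted, scaled)]
    return coeffs


# Illustrative cubic with roots 1, 2, 3, so the pairwise sums are 3, 4, 5.
_, b, c, d = monic_from_roots([1, 2, 3])
from_formula = pair_sum_cubic(b, c, d)
from_roots = monic_from_roots([1 + 2, 1 + 3, 2 + 3])

# The cubic from the example: f = x^3 + 2x^2 + x + 7.
example_g = pair_sum_cubic(2, 1, 7)
print(from_formula, from_roots, example_g)
```

The first two lists coincide, and the last one reproduces the coefficients of $x^3 + 4x^2 + 5x - 5$.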


C. Uniqueness. Every symmetric polynomial in $x_1,\dots,x_n$ can be written in terms
of $\sigma_1,\dots,\sigma_n$ by Theorem 2.2.2. We now prove that this expression is unique.

Theorem 2.2.7 A given symmetric polynomial can be expressed as a polynomial in
the elementary symmetric polynomials in only one way.

Proof: We will use the polynomial ring $F[u_1,\dots,u_n]$, where $u_1,\dots,u_n$ are new
variables. By Theorem 2.1.2, the map sending $u_i$ to $\sigma_i \in F[x_1,\dots,x_n]$ defines a ring
homomorphism

$$\varphi : F[u_1,\dots,u_n] \longrightarrow F[x_1,\dots,x_n].$$

In other words, if $h = h(u_1,\dots,u_n)$ is a polynomial in $u_1,\dots,u_n$ with coefficients in
F, then $\varphi(h) = h(\sigma_1,\dots,\sigma_n)$.

The image of $\varphi$ is the set of all polynomials in the $\sigma_i$ with coefficients in F. We
denote this image by

$$F[\sigma_1,\dots,\sigma_n] \subset F[x_1,\dots,x_n].$$

Note that $F[\sigma_1,\dots,\sigma_n]$ is a subring of $F[x_1,\dots,x_n]$. In this notation, we can write $\varphi$
as a map

$$\varphi : F[u_1,\dots,u_n] \longrightarrow F[\sigma_1,\dots,\sigma_n]. \tag{2.18}$$

This map is onto by the definition of $F[\sigma_1,\dots,\sigma_n]$, and uniqueness will be proved by
showing that $\varphi$ is one-to-one. Be sure you understand this.

To prove that $\varphi$ is one-to-one, it suffices to show that its kernel is {0}. Thus
we must show that if h is a nonzero polynomial in the $u_i$, then $h(\sigma_1,\dots,\sigma_n)$ gives a
nonzero polynomial in the $x_i$. We will sketch the main idea of the argument and leave
the details for Exercise 5.

Let $c\,u_1^{b_1}\cdots u_n^{b_n}$ be a nonzero term of h. Applying $\varphi$ gives $c\,\sigma_1^{b_1}\cdots\sigma_n^{b_n}$, and the
argument of (2.11) shows that the leading term of this polynomial is

$$c\,x_1^{b_1+\cdots+b_n}\,x_2^{b_2+\cdots+b_n}\cdots x_n^{b_n}.$$

Since h is the sum of its terms, $\varphi(h)$ is the sum of the corresponding polynomials
$c\,\sigma_1^{b_1}\cdots\sigma_n^{b_n}$, each of which has a leading term as displayed above. The crucial fact is
that the map

$$(b_1,b_2,\dots,b_n) \mapsto (b_1+\cdots+b_n,\; b_2+\cdots+b_n,\; \dots,\; b_n)$$

is one-to-one, so that the leading terms can't all cancel. Hence $\varphi(h)$ can't be the zero
polynomial, and uniqueness follows. See Exercise 5 for the details. ∎
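
The "crucial fact" in this proof is that the exponent vector of the leading term determines $(b_1,\dots,b_n)$. A short Python sketch makes the inverse explicit; the function names and the brute-force range are illustrative assumptions, and the check is a finite experiment rather than a proof.

```python
from itertools import product


def leading_exponents(b):
    """(b_1, ..., b_n) -> (b_1+...+b_n, b_2+...+b_n, ..., b_n), i.e. suffix sums."""
    return tuple(sum(b[i:]) for i in range(len(b)))


def recover(a):
    """Inverse on the image: b_i = a_i - a_{i+1}, with a_{n+1} = 0."""
    n = len(a)
    return tuple(a[i] - (a[i + 1] if i + 1 < n else 0) for i in range(n))


# All exponent vectors with n = 3 and entries 0..3 (a small, arbitrary range).
vectors = list(product(range(4), repeat=3))
images = {leading_exponents(b) for b in vectors}
round_trips = all(recover(leading_exponents(b)) == b for b in vectors)
print(len(vectors), len(images), round_trips)
```

Since consecutive differences recover the original vector, distinct terms of h give distinct leading monomials, which is what prevents complete cancellation.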
The proof of Theorem 2.2.7 constructs a ring isomorphism

$$\varphi : F[u_1,\dots,u_n] \simeq F[\sigma_1,\dots,\sigma_n], \tag{2.19}$$

where $F[u_1,\dots,u_n]$ is the polynomial ring in variables $u_1,\dots,u_n$ and $u_i \mapsto \sigma_i$. Hence
we can regard $\sigma_1,\dots,\sigma_n$ as independent variables. This leads to the following
interesting application.

Using the above variables $u_1,\dots,u_n$, we call

$$\tilde f = x^n - u_1x^{n-1} + \cdots + (-1)^{n-1}u_{n-1}x + (-1)^nu_n \tag{2.20}$$

the universal polynomial of degree n (the reason for the signs will soon become
clear). This name is justified because if $f = x^n + a_1x^{n-1} + \cdots + a_{n-1}x + a_n \in F[x]$
is any monic polynomial of degree n, then the evaluation map sending $u_i$ to $(-1)^ia_i$
takes $\tilde f$ to f. Thus the universal polynomial of degree n can be mapped to any monic
polynomial of degree n with coefficients in F.

We can construct the roots of $\tilde f$ as follows. Under the isomorphism (2.19), the
polynomial (2.20) maps to $x^n - \sigma_1x^{n-1} + \cdots + (-1)^n\sigma_n$. But $F[\sigma_1,\dots,\sigma_n]$ lies in the
larger ring $F[x_1,\dots,x_n]$, and in this ring, (2.4) gives the factorization

$$x^n - \sigma_1x^{n-1} + \cdots + (-1)^{n-1}\sigma_{n-1}x + (-1)^n\sigma_n = (x-x_1)\cdots(x-x_n).$$

In other words, $x^n - \sigma_1x^{n-1} + \cdots + (-1)^n\sigma_n$ has roots $x_1,\dots,x_n$.
Because of this, we identify (2.20) with its image under (2.19) and call

$$\tilde f = x^n - \sigma_1x^{n-1} + \cdots + (-1)^{n-1}\sigma_{n-1}x + (-1)^n\sigma_n \tag{2.21}$$

the universal polynomial of degree n. Then $\tilde f$ is not only universal in the above sense
but also has known roots, namely $x_1,\dots,x_n$.

As mentioned in the Historical Notes to Section 1.2, Lagrange studied the roots of a
polynomial without regard to their numerical value. For a monic polynomial of degree
n > 0, this means considering its roots as variables $x_1,\dots,x_n$. The above discussion
shows that in modern terms, Lagrange was studying the universal polynomial $\tilde f$.


Mathematical Notes 
Let us discuss further two ideas that appeared in this section. 


• Algebraic Independence. The uniqueness proved in Theorem 2.2.7 implies in
particular that the map (2.19) is one-to-one. Hence there are no nontrivial polynomial
relations among the $\sigma_i$ (since any such relation would give a nonzero element in the
kernel). When this happens, we say that $\sigma_1,\dots,\sigma_n$ are algebraically independent.
Not all collections of polynomials in $F[x_1,\dots,x_n]$ are algebraically independent. See
Exercise 6 for an example.

• Symmetric Rational Functions. The polynomial ring $F[x_1,\dots,x_n]$ sits inside
$F(x_1,\dots,x_n)$, the field of rational functions in $x_1,\dots,x_n$ with coefficients in F. In
this situation, one can ask which elements of $F(x_1,\dots,x_n)$ are symmetric, i.e., are
unchanged under all permutations of the variables. An example is

$$\frac{1}{x_1} + \frac{1}{x_2} + \cdots + \frac{1}{x_n}.$$

Using a common denominator, one can express this as

$$\frac{x_2\cdots x_n + \cdots + x_1\cdots x_{n-1}}{x_1\cdots x_n} = \frac{\sigma_{n-1}}{\sigma_n}.$$

More generally, one can show that any symmetric rational function in $x_1,\dots,x_n$
is a rational function in the elementary symmetric polynomials. In other words,
all symmetric elements of $F(x_1,\dots,x_n)$ lie in the subfield $F(\sigma_1,\dots,\sigma_n)$ of rational
functions in $\sigma_1,\dots,\sigma_n$. This will be proved in Exercises 7 and 8.

In Chapters 6, 7, and 8, we will study

$$F(\sigma_1,\dots,\sigma_n) \subset F(x_1,\dots,x_n)$$

from the point of view of Galois theory. We will see that the Galois group of this
field extension is the symmetric group $S_n$. This in turn will enable us to determine
when one can solve polynomials of degree n by radicals.


Historical Notes 


Symmetric polynomials have been around for a long time. In 1629 Albert Girard
published Invention nouvelle en l'algèbre, which contains a clear description of the
elementary symmetric polynomials. Girard also considers the power sums

$$s_r = x_1^r + \cdots + x_n^r.$$

In the notation used above, note that $s_r = \sum_n x_1^r$. Girard gives formulas for $s_1,s_2,s_3,s_4$
in terms of the $\sigma_i$ (see Exercise 17).

In 1665-1666 Isaac Newton worked out many examples of symmetric polynomials,
expressing them in terms of the $\sigma_i$. His 1707 book Arithmetica universalis shows
how power sums relate to elementary symmetric polynomials. For r = 1, the relation
is trivial, namely $s_1 = \sigma_1$, and for r > 1, we have the Newton identities, which state
that

$$\begin{aligned}
s_r &= \sigma_1s_{r-1} - \sigma_2s_{r-2} + \cdots + (-1)^{r-1}r\,\sigma_r && \text{if } 1 < r \le n,\\
s_r &= \sigma_1s_{r-1} - \sigma_2s_{r-2} + \cdots + (-1)^{n-1}\sigma_ns_{r-n} && \text{if } r > n.
\end{aligned} \tag{2.22}$$

Proofs of these identities can be found in [2, Ch. 7, §1], [3, pp. 62-63, 72-73], and
[7, pp. 114-115].
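
The Newton identities (2.22) give a recursive way to compute power sums from the values of the $\sigma_i$. The Python sketch below is an illustration under stated assumptions: the function names and the sample values 1, 2, 3 are chosen here, and the middle terms hidden by the dots in (2.22) are taken to be $(-1)^{k-1}\sigma_ks_{r-k}$, which is the standard form of the identities.

```python
from itertools import combinations
from math import prod


def elementary_values(xs):
    """[sigma_0, sigma_1, ..., sigma_n] evaluated at the numbers xs (sigma_0 = 1)."""
    return [sum(prod(c) for c in combinations(xs, r)) for r in range(len(xs) + 1)]


def power_sums_newton(sigma, count):
    """[s_1, ..., s_count] computed from sigma_0..sigma_n via the Newton identities."""
    n = len(sigma) - 1
    s = [None]  # placeholder so that s[r] is s_r
    for r in range(1, count + 1):
        total = 0
        # Terms (-1)^(k-1) * sigma_k * s_(r-k) for k = 1, ..., min(r-1, n).
        for k in range(1, min(r - 1, n) + 1):
            total += (-1) ** (k - 1) * sigma[k] * s[r - k]
        # Extra term (-1)^(r-1) * r * sigma_r, present only when r <= n.
        if r <= n:
            total += (-1) ** (r - 1) * r * sigma[r]
        s.append(total)
    return s[1:]


xs = [1, 2, 3]  # illustrative values for x_1, x_2, x_3
sigma = elementary_values(xs)
via_newton = power_sums_newton(sigma, 5)
direct = [sum(x ** r for x in xs) for r in range(1, 6)]
print(via_newton, direct)
```

The recursion only ever uses $\sigma_1,\dots,\sigma_n$ and earlier power sums, so it covers both cases of (2.22) with a single loop.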

As already noted, Waring's Meditationes Algebraicæ from 1782 contains an
implicit version of the algorithm used in the proof of Theorem 2.2.2, though in examples
he often used clever shortcuts. The Fundamental Theorem of Symmetric Polynomials
was widely known and used in the eighteenth century, though the first complete
proof is due to Gauss. He was also the first to raise the issue of uniqueness, and his
proof is the one we used.

One difference between Gauss's proof of Theorem 2.2.2 and ours is that he ordered
his polynomials differently. In [Gauss, Vol. III, p. 36], he says:

Dein e duobus terminis
$M a^{\alpha} b^{\beta} c^{\gamma}\cdots$ et $M a^{\alpha'} b^{\beta'} c^{\gamma'}\cdots$
priori ordinem altiorem tribuemus quam posteriori, si fit
vel $\alpha > \alpha'$, vel $\alpha = \alpha'$, $\beta > \beta'$, vel $\alpha = \alpha'$, $\beta = \beta'$, $\gamma > \gamma'$, vel etc.
i.e. si e differentiis $\alpha - \alpha'$, $\beta - \beta'$, $\gamma - \gamma'$ etc. prima, quae non evanescit, positiva
evadit.

Even though this is in Latin, the meaning is quite clear once one realizes that “vel ...
vel ... vel” means “either ... or ... or.” This is now called lexicographic order. In
Exercise 9 you will use this order to prove Theorem 2.2.2.

Although our interest in symmetric polynomials is due to their importance in
Galois theory, these polynomials also arise naturally in invariant theory, algebraic
combinatorics, and representation theory. A basic reference for symmetric
polynomials is [4]. See also [Tignol, Chs. 4, 8] for the history of symmetric polynomials.


Exercises for Section 2.2 


Exercise 1. Show that the leading term of $\sigma_r$ is $x_1x_2\cdots x_r$.


Exercise 2. This exercise will study the order relation defined in (2.5). Given an exponent
vector $\alpha = (a_1,\dots,a_n)$, where each $a_i \ge 0$ is an integer, let $x^\alpha$ denote the monomial

$$x^\alpha = x_1^{a_1}x_2^{a_2}\cdots x_n^{a_n}.$$

If $\alpha$ and $\beta$ are exponent vectors, note that $x^\alpha x^\beta = x^{\alpha+\beta}$. Also, the leading term of a nonzero
polynomial $f \in F[x_1,\dots,x_n]$ will be denoted LT(f).

(a) Suppose that $x^\alpha > x^\beta$, and let $x^\gamma$ be any monomial. Prove that $x^{\alpha+\gamma} > x^{\beta+\gamma}$.

(b) Suppose that $x^\alpha > x^\beta$ and $x^\gamma > x^\delta$. Prove that $x^{\alpha+\gamma} > x^{\beta+\delta}$.

(c) Let $f,g \in F[x_1,\dots,x_n]$ be nonzero. Prove that LT(fg) = LT(f)LT(g).


Exercise 3. Prove (2.13)-(2.16). For (2.13), a computer will be helpful; the others can be
proved by hand using the identity

$$(y_1+\cdots+y_m)^2 = y_1^2+\cdots+y_m^2 + 2\sum_{i<j} y_iy_j.$$


Exercise 4. Let $f = x^3 + bx^2 + cx + d \in F[x]$ have roots $\alpha_1,\alpha_2,\alpha_3$ in a field L containing F,
and let g be the polynomial defined in (2.17). Show carefully that

$$g(x) = x^3 + 2bx^2 + (b^2+c)\,x + bc - d.$$


Exercise 5. This exercise will complete the proof of Theorem 2.2.7. Let $h \in F[u_1,\dots,u_n]$ be
a nonzero polynomial. The goal is to prove that $h(\sigma_1,\dots,\sigma_n)$ is not the zero polynomial in
$x_1,\dots,x_n$.

(a) If $c\,u_1^{b_1}\cdots u_n^{b_n}$ is a term of h, then use Exercise 2 to show that the leading term of
$c\,\sigma_1^{b_1}\cdots\sigma_n^{b_n}$ is $c\,x_1^{b_1+\cdots+b_n}x_2^{b_2+\cdots+b_n}\cdots x_n^{b_n}$.

(b) Show that $(b_1,\dots,b_n) \mapsto (b_1+\cdots+b_n,\, b_2+\cdots+b_n,\, \dots,\, b_n)$ is one-to-one.

(c) To see why $h(\sigma_1,\dots,\sigma_n)$ is nonzero, consider the term of $h(u_1,\dots,u_n)$ for which the
leading term of $c\,\sigma_1^{b_1}\cdots\sigma_n^{b_n}$ is maximal. Prove that this leading term is in fact the leading
term of $h(\sigma_1,\dots,\sigma_n)$, and explain how this proves what we want.


Exercise 6. Here is an example of polynomials which are not algebraically independent.
Consider $x_1^2, x_1x_2, x_2^2 \in F[x_1,x_2]$, and let $\phi : F[u_1,u_2,u_3] \to F[x_1,x_2]$ be defined by

$$\phi(u_1) = x_1^2, \quad \phi(u_2) = x_1x_2, \quad \phi(u_3) = x_2^2.$$

Show that $\phi$ is not one-to-one by finding a nonzero polynomial $h \in F[u_1,u_2,u_3]$ such that
$\phi(h) = 0$. (Using the notion of transcendence degree, one can show that any collection of
three or more elements in $F[x_1,x_2]$ is algebraically dependent. See, for example, [Jacobson,
Vol. II, Sec. 8.12].)


Exercise 7. Given a polynomial $f \in F[x_1,\dots,x_n]$ and a permutation $\sigma \in S_n$, let $\sigma\cdot f$ denote the
polynomial obtained from f by permuting the variables according to $\sigma$. Show that $\prod_{\sigma\in S_n}\sigma\cdot f$
and $\sum_{\sigma\in S_n}\sigma\cdot f$ are symmetric polynomials.


Exercise 8. In this exercise, you will prove that if $\varphi \in F(x_1,\dots,x_n)$ is symmetric, then $\varphi$
is a rational function in $\sigma_1,\dots,\sigma_n$ with coefficients in F. To begin the proof, we know that
$\varphi = A/B$, where A and B are in $F[x_1,\dots,x_n]$. Note that A and B need not be symmetric; only
their quotient $\varphi = A/B$ is. Let

$$C = \prod_{\sigma \in S_n\setminus\{e\}} \sigma\cdot B,$$

where we are using the notation of Exercise 7.
(a) Use Exercise 7 to show that BC is a symmetric polynomial.
(b) Then use the symmetry of $\varphi = A/B$ to show that AC is a symmetric polynomial.
(c) Use $\varphi = (AC)/(BC)$ and Theorem 2.2.2 to conclude that $\varphi$ is a rational function in the
elementary symmetric polynomials with coefficients in F.


Exercise 9. In the Historical Notes, we gave Gauss’s definition of lexicographic order. 

(a) Give a definition (in English) of lexicographic order. 

(b) In the proof of Theorem 2.2.2, we showed that graded lexicographic order has the property
that there are only finitely many monomials less than a given monomial. In contrast, this
property fails for lexicographic order. Give an explicit example to illustrate this.

(c) In spite of part (b), lexicographic order does have an interesting finiteness property.
Namely, prove that there is no infinite sequence of polynomials $f_1, f_2, f_3,\dots$ that have
strictly decreasing leading terms according to lexicographic order.

(d) Explain how part (c) allows one to prove Theorem 2.2.2 using lexicographic order. 

Besides graded lexicographic order and lexicographic order, there are many other ways to 
order monomials. See [2, Ch. 2, §2]. 


Exercise 10. Apply the proof of Theorem 2.2.2 to express $\sum_3 x_1^2x_2$ in terms of $\sigma_1,\sigma_2,\sigma_3$.


Exercise 11. Let the roots of $y^3 + 2y^2 - 3y + 5$ be $\alpha,\beta,\gamma \in \mathbb{C}$. Find polynomials with integer
coefficients that have the following roots:

(a) $\alpha\beta$, $\alpha\gamma$, and $\beta\gamma$.

(b) $\alpha+1$, $\beta+1$, and $\gamma+1$.

(c) $\alpha^2$, $\beta^2$, and $\gamma^2$.

Exercise 12. Consider the symmetric polynomial $f = \sum_n x_1^{a_1}\cdots x_n^{a_n}$.

(a) Prove that f has n! terms when $a_1,\dots,a_n$ are distinct.

(b) (More challenging) Suppose that the exponents $a_1,\dots,a_n$ break up into r disjoint groups
so that exponents within the same group are equal, but exponents from different groups are
unequal. Let $\ell_i$ denote the number of elements in the ith group, so that $\ell_1+\cdots+\ell_r = n$.
Prove that the number of terms in f is

$$\frac{n!}{\ell_1!\cdots\ell_r!}.$$


For example, f = ),x}x3x3x4x5 has €; = £2 = 2 and £; = 1. It follows that f has 
5!/(2!2!1!) = 30 terms. 


Exercises 13-16 will discuss some classic tricks for dealing with symmetric polynomials. A
polynomial $g \in F[x_1,\dots,x_n]$ is homogeneous of total degree d if every nonzero term of g has
total degree d.


Exercise 13. Let $g_1,g_2 \in F[x_1,\dots,x_n]$ be homogeneous of total degrees $d_1,d_2$.
(a) Show that $g_1g_2$ is homogeneous of total degree $d_1+d_2$.
(b) When is $g_1+g_2$ homogeneous?


Exercise 14. We define the weight of $\sigma_1^{a_1}\cdots\sigma_n^{a_n}$ to be $a_1 + 2a_2 + 3a_3 + \cdots + na_n$.
(a) Prove that $\sigma_1^{a_1}\cdots\sigma_n^{a_n}$ is homogeneous and that its weight is the same as its total degree
when considered as a polynomial in $x_1,\dots,x_n$.
(b) Let $f \in F[x_1,\dots,x_n]$ be symmetric and homogeneous of total degree d. Show that f is a
linear combination of products $\sigma_1^{a_1}\cdots\sigma_n^{a_n}$ of weight d.


Exercise 15. Given a polynomial $f \in F[x_1,\dots,x_n]$, let $\deg_i(f)$ be the maximal exponent of $x_i$
which appears in f. Thus $f = x_1^3x_2 + x_1x_2^4$ has $\deg_1(f) = 3$ and $\deg_2(f) = 4$.

(a) If f is symmetric, explain why the $\deg_i(f)$ are the same for i = 1,...,n.

(b) Show that $\deg_i(\sigma_1^{a_1}\cdots\sigma_n^{a_n}) = a_1+a_2+\cdots+a_n$ for i = 1,...,n.


Exercise 16. This exercise is based on [7, pp. 110-112] and will express the discriminant
$\Delta = (x_1-x_2)^2(x_1-x_3)^2(x_2-x_3)^2$ in terms of the elementary symmetric functions without
using a computer. We will use the terminology of Exercises 14 and 15. Note that $\Delta$ is
homogeneous of total degree 6 and $\deg_i(\Delta) = 4$ for i = 1,2,3.

(a) Find all products $\sigma_1^{a_1}\sigma_2^{a_2}\sigma_3^{a_3}$ of weight 6 and $\deg_i(\sigma_1^{a_1}\sigma_2^{a_2}\sigma_3^{a_3}) \le 4$.

(b) Explain how part (a) implies that there are constants $\ell_1,\dots,\ell_5$ such that

$$\Delta = \ell_1\sigma_3^2 + \ell_2\sigma_1\sigma_2\sigma_3 + \ell_3\sigma_1^3\sigma_3 + \ell_4\sigma_2^3 + \ell_5\sigma_1^2\sigma_2^2.$$

(c) We will compute the $\ell_i$ by using the universal property of the elementary symmetric
polynomials. For example, to determine $\ell_1$, use the cube roots of unity $1,\omega,\omega^2$ to show
that $x^3 - 1$ has discriminant $-27$. By applying the ring homomorphism defined by $x_1 \mapsto 1$,
$x_2 \mapsto \omega$, $x_3 \mapsto \omega^2$ to part (b), conclude that $\ell_1 = -27$.

(d) Show that $x^3 - x$ has roots $0, \pm 1$ and discriminant 4. By adapting the argument of part (c),
conclude that $\ell_4 = -4$.

(e) Similarly, use $x^3 - 2x^2 + x$ to show that $\ell_5 = 1$.

(f) Next, note that $x^3 - 2x^2 - x + 2$ has roots $\pm 1, 2$, and use this (together with the known
values of $\ell_1, \ell_4, \ell_5$) to conclude that $\ell_2 - 4\ell_3 = 34$.

(g) Finally, use $x^3 - 3x^2 + 3x - 1$ to show $\ell_2 + 3\ell_3 = 6$. Using part (f), this implies $\ell_2 = 18$,
$\ell_3 = -4$ and gives the usual formula for $\Delta$.

Other examples illustrating this method can be found in [1, pp. 442-444].

Exercises 17-20 will study power sums $s_r = x_1^r + \cdots + x_n^r$ and the Newton identities (2.22)
discussed in the Historical Notes.


Exercise 17. Use the Newton identities (2.22) to express the power sums $s_2,s_3,s_4$ in terms of
the elementary symmetric polynomials $\sigma_1,\sigma_2,\sigma_3,\sigma_4$.


Exercise 18. Suppose that complex numbers $\alpha,\beta,\gamma$ satisfy the equations


a+B+y7=3, 
+P +=, 
+H4+y= 12. 
Show that $\alpha^n + \beta^n + \gamma^n \in \mathbb{Z}$ for all $n \ge 4$. Also compute $\alpha^4 + \beta^4 + \gamma^4$.


Exercise 19. Suppose that F is a field of characteristic 0.
(a) Use the Newton identities (2.22) and Theorem 2.2.2 to prove that every symmetric
polynomial in $F[x_1,\dots,x_n]$ can be expressed as a polynomial in $s_1,\dots,s_n$.
(b) Show how to express $\sigma_4 \in F[x_1,x_2,x_3,x_4]$ as a polynomial in $s_1,s_2,s_3,s_4$.


Exercise 20. Let $\mathbb{F}_2$ be the field with two elements. Show that in $\mathbb{F}_2[x_1,\dots,x_n]$, it is impossible
to express $\sigma_2$ as a polynomial in $s_1,\dots,s_n$ when $n \ge 2$.


2.3 COMPUTING WITH SYMMETRIC POLYNOMIALS (OPTIONAL) 


The method described in Section 2.2 for expressing a given symmetric polynomial
in terms of $\sigma_1,\dots,\sigma_n$ is useful for simple problems, but can be cumbersome in more
complicated situations. Fortunately, computer algebra programs such as Maple or
Mathematica make it relatively easy to represent symmetric functions in terms of
the elementary symmetric polynomials. We will discuss briefly how these powerful
programs can be used to manipulate symmetric polynomials. (One can use Maple
and Mathematica in other parts of Galois theory as well; see [Swallow].)

Although few readers will have access to both Maple and Mathematica, we suggest
reading both discussions in order to better appreciate the underlying ideas.


A. Using Mathematica. We begin by using Mathematica to write the discriminant

$$\Delta = (x_1-x_2)^2(x_1-x_3)^2(x_2-x_3)^2$$

from Section 1.2 in terms of the elementary symmetric polynomials. We can think
of this in terms of the system of equations

$$\begin{aligned}
\Delta &= (x_1-x_2)^2(x_1-x_3)^2(x_2-x_3)^2,\\
\sigma_1 &= x_1+x_2+x_3,\\
\sigma_2 &= x_1x_2+x_1x_3+x_2x_3,\\
\sigma_3 &= x_1x_2x_3.
\end{aligned} \tag{2.23}$$

The idea is to eliminate $x_1,x_2,x_3$ from these equations. This will give the desired
expression for $\Delta$ in terms of $\sigma_1,\sigma_2,\sigma_3$.

We tell Mathematica to do this elimination using the command

    Eliminate[{Delta == (x1-x2)^2 (x1-x3)^2 (x2-x3)^2, e1 ==
    x1+x2+x3, e2 == x1 x2 + x1 x3 + x2 x3, e3 == x1 x2 x3}, {x1,x2,x3}]

The output is

    -e1^2 e2^2 + 4 e1^3 e3 - 18 e1 e2 e3 == -Delta - 4 e2^3 - 27 e3^2,

which tells us that

$$\Delta = -4\sigma_2^3 - 27\sigma_3^2 + \sigma_1^2\sigma_2^2 - 4\sigma_1^3\sigma_3 + 18\sigma_1\sigma_2\sigma_3.$$

This agrees with (1.20) from Section 1.2 after the substitution $b = -\sigma_1$, $c = \sigma_2$, and
$d = -\sigma_3$.
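
The resulting formula for $\Delta$ can be spot-checked on explicit roots without any computer algebra system. This Python sketch is an independent numerical check, not a replacement for the `Eliminate` computation; the function names and the sample triples of roots are assumptions made for illustration.

```python
from itertools import combinations
from math import prod


def discriminant_from_roots(roots):
    """Product of (r_i - r_j)^2 over all pairs i < j."""
    return prod((a - b) ** 2 for a, b in combinations(roots, 2))


def discriminant_from_sigmas(s1, s2, s3):
    """The expression for Delta in terms of sigma_1, sigma_2, sigma_3 (cubic case)."""
    return (s1 ** 2 * s2 ** 2 - 4 * s2 ** 3 - 4 * s1 ** 3 * s3
            + 18 * s1 * s2 * s3 - 27 * s3 ** 2)


def sigmas(roots):
    """(sigma_1, sigma_2, sigma_3) evaluated at three numbers."""
    a, b, c = roots
    return (a + b + c, a * b + a * c + b * c, a * b * c)


# Illustrative integer triples, including one with a repeated root.
samples = [(1, 2, 3), (0, 1, -1), (2, 2, 5), (-3, 4, 7)]
checks = [(discriminant_from_roots(r), discriminant_from_sigmas(*sigmas(r)))
          for r in samples]
print(checks)
```

Each pair in `checks` has equal entries, and the triple with a repeated root gives 0, as a discriminant should.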

Using the Eliminate command is straightforward, though having to enter the
elementary symmetric polynomials by hand can be time-consuming, especially when
the number of variables is large. This can be avoided by using the Mathematica
package SymmetricPolynomials that comes with the program. This package is
loaded by

    << Algebra`SymmetricPolynomials`

Then the above computation can be done using the command

    SymmetricReduction[(x1-x2)^2 (x1-x3)^2 (x2-x3)^2,
    {x1,x2,x3}, {e1,e2,e3}]

The output is the two-element list

    {e1^2 e2^2 - 4 e2^3 - 4 e1^3 e3 + 18 e1 e2 e3 - 27 e3^2, 0}

where the second element, 0, tells us that $\Delta$ is in fact symmetric, and the first element
is the polynomial expressing $\Delta$ in terms of the $\sigma_i$. To go directly to the first element
of this list, one could give the command

    SymmetricReduction[(x1-x2)^2 (x1-x3)^2 (x2-x3)^2,
    {x1,x2,x3}, {e1,e2,e3}][[1]]

since in general Mathematica uses “[[i]]” to extract the ith element of a list.


Here is an example which illustrates one of the interesting things which can be 
done with symmetric polynomials. 


Example 2.3.1 Let $\alpha_1, \alpha_2, \alpha_3 \in \mathbb{C}$ be the roots of $y^3 + 2y^2 - 3y + 5$. Our goal is to use Mathematica to find the cubic polynomial whose roots are $\alpha_1^3, \alpha_2^3, \alpha_3^3$.
Let $y_1, y_2, y_3$ be variables and define the polynomial f in Mathematica to be

    f = (y - y1^3)(y - y2^3)(y - y3^3)

Note that the evaluation $y_i \mapsto \alpha_i$ takes f to the polynomial we want. If we multiply out f, we get a polynomial whose coefficients are symmetric in $y_1, y_2, y_3$. We express these in terms of the elementary symmetric polynomials using the Mathematica command

    Do[Print[SymmetricReduction[Coefficient[f,y,i],
    {y1,y2,y3}, {e1,e2,e3}][[1]]], {i,0,2}]


This instructs Mathematica to print out the coefficients of f expressed in terms of the 
elementary symmetric polynomials, here denoted e1,e2,e3. The output is 


(2.24)
    constant term       : -e3^3,
    coefficient of y    : e2^3 - 3 e1 e2 e3 + 3 e3^2,
    coefficient of y^2  : -e1^3 + 3 e1 e2 - 3 e3.

The evaluation $y_i \mapsto \alpha_i$ sends e1 $\mapsto -2$, e2 $\mapsto -3$, e3 $\mapsto -5$. Using (2.24), we see that $y^3 + 41y^2 + 138y + 125$ is the polynomial with roots $\alpha_1^3, \alpha_2^3, \alpha_3^3$.


The formulas (2.24) imply that for any cubic polynomial, we can find a cubic polynomial whose roots are the cubes of the roots of the given one. This is part of the universal aspect of the elementary symmetric polynomials.
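
The universal formulas (2.24) are easy to check independently of Mathematica. The following Python sketch is not part of the text's computer algebra session, and its function names are illustrative only. It substitutes $\sigma_1 = -2$, $\sigma_2 = -3$, $\sigma_3 = -5$ into (2.24) and then confirms, by exact integer polynomial division, that $y^3 + 2y^2 - 3y + 5$ divides $g(y^3)$, where $g$ is the cubic produced by (2.24). This divisibility is what one expects if every root of the original cubic has its cube among the roots of $g$.

```python
def cubes_polynomial(e1, e2, e3):
    """Coefficients [c2, c1, c0] of y^3 + c2*y^2 + c1*y + c0 given by (2.24)."""
    c0 = -e3**3
    c1 = e2**3 - 3*e1*e2*e3 + 3*e3**2
    c2 = -e1**3 + 3*e1*e2 - 3*e3
    return [c2, c1, c0]

def poly_rem(num, den):
    """Remainder of num on division by a monic den (highest degree first)."""
    num = list(num)
    while len(num) >= len(den):
        lead = num[0]
        for k, d in enumerate(den):
            num[k] -= lead * d
        num.pop(0)  # the leading coefficient is now zero
    return num

# y^3 + 2y^2 - 3y + 5 has sigma1 = -2, sigma2 = -3, sigma3 = -5.
c2, c1, c0 = cubes_polynomial(-2, -3, -5)
print(c2, c1, c0)

# g(x^3) = x^9 + c2*x^6 + c1*x^3 + c0 should be divisible by x^3 + 2x^2 - 3x + 5.
g_of_x_cubed = [1, 0, 0, c2, 0, 0, c1, 0, 0, c0]
remainder = poly_rem(g_of_x_cubed, [1, 2, -3, 5])
print(remainder)
```

A zero remainder is an exact check, so no floating-point root finding is needed.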


B. Using Maple. As above, our first Maple computation will be to express the discriminant $\Delta = (x_1 - x_2)^2(x_1 - x_3)^2(x_2 - x_3)^2$ in terms of $\sigma_1, \sigma_2, \sigma_3$. We will again use the equations (2.23) to eliminate $x_1, x_2, x_3$, which will give the desired expression for $\Delta$.

To do this in Maple, we proceed as follows. The last three lines of (2.23) give the polynomials

$$e_1 - x_1 - x_2 - x_3, \quad e_2 - x_1x_2 - x_1x_3 - x_2x_3, \quad e_3 - x_1x_2x_3 \tag{2.25}$$

in $\mathbb{C}[x_1, x_2, x_3, e_1, e_2, e_3]$. These generate an ideal in this ring, and we eliminate $x_1, x_2, x_3$ from $\Delta$ by replacing all instances of $x_1 + x_2 + x_3$, $x_1x_2 + x_1x_3 + x_2x_3$, $x_1x_2x_3$ with $e_1$, $e_2$, $e_3$, respectively. This operation can be thought of as taking the normal form of $\Delta$ with respect to the ideal generated by (2.25).


The first step is to load the Maple package Groebner, which contains the commands we need. This is done by

    with(Groebner);

We next tell Maple to order the monomials in $\mathbb{C}[x_1, x_2, x_3, e_1, e_2, e_3]$ via

    T := lexdeg([x1,x2,x3], [e1,e2,e3]);

The monomial order T is specially designed for elimination and is more efficient than the graded lexicographic order used in the proof of Theorem 2.2.2. We need to specify a monomial order because the precise definition of "normal form" depends on how the monomials are ordered.

Once we have the monomial order T, we compute an intermediate object called a Gröbner basis using the command

    GB := Basis([e1-x1-x2-x3, e2-x1*x2-x1*x3-x2*x3,
    e3-x1*x2*x3], T):

Note that Maple uses * for multiplication. We also used : to suppress output, since we don't need to see the Gröbner basis. Roughly speaking, the Gröbner basis consists of polynomials that generate the same ideal as (2.25) and are optimized for the monomial order T.

The final step is to compute the normal form. This is precisely what the Maple command NormalForm does:

    NormalForm((x1-x2)^2 * (x1-x3)^2 * (x2-x3)^2, GB, T);

This gives the output

    -4*e1^3*e3 + 18*e1*e2*e3 - 27*e3^2 + e1^2*e2^2 - 4*e2^3


which agrees with our earlier computation. 

The Mathematica command Eliminate described earlier uses a Gröbner basis computation similar to what we did here. Gröbner basis methods can be applied to a variety of elimination problems. The full details can be found in [2].

Notice that once the monomial order T is defined and the Gröbner basis GB is computed, NormalForm can be used repeatedly to write a series of symmetric polynomials in terms of the elementary symmetric polynomials.


Example 2.3.2 In Example 2.3.1, we used Mathematica to find a polynomial whose roots were the cubes of the roots $\alpha_1, \alpha_2, \alpha_3$ of $y^3 + 2y^2 - 3y + 5$. Let us redo this example using Maple. We first enter the polynomial

    f := (y - y1^3) * (y - y2^3) * (y - y3^3);

Then the Maple command

    for i from 0 to 2 do print(NormalForm(coeff(f,y,i),GB,T)) od;

prints out the coefficients of f expressed in terms of the elementary symmetric polynomials. As in (2.24), the result is

    constant term       : -e3^3,
    coefficient of y    : -3 e1 e2 e3 + 3 e3^2 + e2^3,
    coefficient of y^2  : -3 e3 + 3 e1 e2 - e1^3,

and the evaluation e1 $\mapsto -2$, e2 $\mapsto -3$, e3 $\mapsto -5$ shows that the roots of $y^3 + 41y^2 + 138y + 125$ are $\alpha_1^3, \alpha_2^3, \alpha_3^3$.


Similar examples are given in the exercises.

Exercises for Section 2.3

Exercise 1. Examples 2.3.1 and 2.3.2 showed that the roots of $y^3 + 41y^2 + 138y + 125$ are the cubes of the roots of $y^3 + 2y^2 - 3y + 5$. Verify this numerically.


Exercise 2. Use the method of Example 2.3.1 or 2.3.2 to find the cubic polynomial whose roots are the fourth powers of the roots of the polynomial $y^3 + 2y^2 - 3y + 5$.


Exercise 3. Express )_, x}.x? in terms of the elementary symmetric polynomials. This example 
was first done by Newton around 1665. 


Exercise 4. Given a cubic $x^3 + bx^2 + cx + d$, what condition must b, c, d satisfy in order that one root be the average of the other two?


Exercise 5. Given a quartic $x^4 + bx^3 + cx^2 + dx + e$, what condition must b, c, d, e satisfy in order that one root be the negative of another?


Exercise 6. Find the quartic polynomial whose roots are obtained by adding 1 to each of the roots of $x^4 + 3x^2 + 4x + 7$.


2.4 THE DISCRIMINANT 


Given $n \ge 2$ variables $x_1, \dots, x_n$ over a field F, the discriminant is

$$\Delta = \prod_{1 \le i < j \le n} (x_i - x_j)^2 \in F[x_1, \dots, x_n].$$

There are $\binom{n}{2} = \frac{1}{2}n(n-1)$ factors in this product. Furthermore, since $(x_i - x_j)^2 = -(x_i - x_j)(x_j - x_i)$, we can rewrite the above formula as

$$\Delta = (-1)^{n(n-1)/2} \prod_{\substack{1 \le i, j \le n \\ i \ne j}} (x_i - x_j).$$

This shows that if we permute the variables, then we still have the product of the differences of all distinct pairs of variables. Thus $\Delta$ is symmetric in $x_1, \dots, x_n$. Theorem 2.2.2 implies that $\Delta$ can be written as a polynomial in the elementary symmetric polynomials $\sigma_1, \dots, \sigma_n$. In other words,

$$\Delta \in F[\sigma_1, \dots, \sigma_n].$$

When n = 3, the formulas of Section 1.2 (or the methods of Section 2.3) imply that

$$\Delta = -4\sigma_2^3 - 27\sigma_3^2 + \sigma_1^2\sigma_2^2 - 4\sigma_1^3\sigma_3 + 18\sigma_1\sigma_2\sigma_3. \tag{2.26}$$

For general n an explicit formula for $\Delta$ in terms of $\sigma_1, \dots, \sigma_n$ will be given in the Mathematical Notes.
The definition of $\Delta$ shows that it has a square root in $F[x_1, \dots, x_n]$. We define

$$\sqrt{\Delta} = \prod_{1 \le i < j \le n} (x_i - x_j).$$

We next describe how $\sqrt{\Delta}$ transforms under permutations.

Proposition 2.4.1 If $\sigma \in S_n$, then

$$\sigma \cdot \sqrt{\Delta} = \mathrm{sgn}(\sigma)\sqrt{\Delta},$$

where $\mathrm{sgn}(\sigma)$ is the sign of $\sigma$ defined in (A.3), and $\sigma \cdot \sqrt{\Delta}$ is the polynomial obtained from $\sqrt{\Delta}$ by permuting the variables $x_1, \dots, x_n$ according to $\sigma$.

Proof: In 1841, Jacobi studied how $\sqrt{\Delta}$ transforms under a transposition $(i\,j)$. His argument, adapted to our notation, goes as follows. We can assume $i < j$. Then observe that there is $\varepsilon \in \{+1, -1\}$ such that

$$\sqrt{\Delta} = \varepsilon\,(x_i - x_j) \prod_{k \ne i,j} (x_i - x_k)(x_j - x_k) \prod_{\substack{l,m \ne i,j \\ l < m}} (x_l - x_m). \tag{2.27}$$

This follows because the factors appearing in the right-hand side are, up to sign, the factors of $\sqrt{\Delta}$. For example, when $k \ne i, j$, then

$$x_i - x_k = \begin{cases} x_i - x_k, & i < k, \\ -(x_k - x_i), & k < i. \end{cases}$$

Combining all of these signs gives $\varepsilon = \pm 1$ in (2.27). The transposition $(i\,j)$ takes $x_i - x_j$ to $x_j - x_i = -(x_i - x_j)$. Since it also takes $(x_i - x_k)(x_j - x_k)$ to $(x_j - x_k)(x_i - x_k)$ and doesn't affect $x_l - x_m$ for $l, m \ne i, j$, we see that (2.27) implies that $(i\,j) \cdot \sqrt{\Delta} = -\sqrt{\Delta}$.

Now let $\sigma \in S_n$ and write $\sigma$ as a product of transpositions, say $\sigma = \tau_1 \cdots \tau_\ell$. Then $\tau_i \cdot \sqrt{\Delta} = -\sqrt{\Delta}$ implies that

$$\sigma \cdot \sqrt{\Delta} = (\tau_1 \cdots \tau_\ell) \cdot \sqrt{\Delta} = (-1)^\ell \sqrt{\Delta}. \tag{2.28}$$

Since $\mathrm{sgn}(\sigma) = (-1)^\ell$ (be sure you understand why), the proposition follows. ∎
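
Proposition 2.4.1 can be spot-checked numerically. The Python sketch below is an illustration, not a proof. It evaluates $\sqrt{\Delta}$ at a sample point with distinct integer coordinates, permutes the coordinates by every element of $S_4$, and compares the result with $\mathrm{sgn}(\sigma)\sqrt{\Delta}$. The sign is computed from the parity of the inversion count, which is assumed here to agree with the definition in (A.3). The helper names are hypothetical.

```python
from itertools import permutations

def sqrt_disc(xs):
    """Product of (x_i - x_j) over all pairs i < j."""
    prod = 1
    for i in range(len(xs)):
        for j in range(i + 1, len(xs)):
            prod *= xs[i] - xs[j]
    return prod

def sign(perm):
    """Sign of a permutation of 0..n-1, via the parity of its inversion count."""
    inversions = sum(1 for i in range(len(perm))
                     for j in range(i + 1, len(perm)) if perm[i] > perm[j])
    return -1 if inversions % 2 else 1

xs = (2, 3, 5, 7)  # distinct sample values for x_1, ..., x_4
base = sqrt_disc(xs)

# Permuting the variables replaces x_i by x_{p(i)}.
checks = [sqrt_disc([xs[p[i]] for i in range(4)]) == sign(p) * base
          for p in permutations(range(4))]
print(all(checks))
```

Whether one permutes by $\sigma$ or by $\sigma^{-1}$ does not matter for this check, since both have the same sign.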

We next define the discriminant of a monic polynomial

$$f = x^n + a_1x^{n-1} + \cdots + a_ix^{n-i} + \cdots + a_n \in F[x]$$

of degree $n \ge 2$. As in Section 2.2, the universal polynomial

$$\tilde{f} = x^n - \sigma_1x^{n-1} + \cdots + (-1)^i\sigma_ix^{n-i} + \cdots + (-1)^n\sigma_n$$

maps to f via $\sigma_i \mapsto (-1)^ia_i$. Since $\Delta$ is a symmetric polynomial, we can write

$$\Delta = \Delta(\sigma_1, \dots, \sigma_i, \dots, \sigma_n) \in F[\sigma_1, \dots, \sigma_n]. \tag{2.29}$$

Then we define the discriminant of f, denoted $\Delta(f)$, to be

$$\Delta(f) = \Delta(-a_1, \dots, (-1)^ia_i, \dots, (-1)^na_n) \in F. \tag{2.30}$$

Thus the evaluation $\sigma_i \mapsto (-1)^ia_i$ takes $\tilde{f}$ to f and $\Delta$ to $\Delta(f)$.
We also define $\Delta(f) = 1$ when f has degree 1. This will be useful later.

Example 2.4.2 Consider $f = x^3 + bx^2 + cx + d$. We saw in (2.26) that

$$\Delta = \Delta(\sigma_1, \sigma_2, \sigma_3) = -4\sigma_2^3 - 27\sigma_3^2 + \sigma_1^2\sigma_2^2 - 4\sigma_1^3\sigma_3 + 18\sigma_1\sigma_2\sigma_3.$$

Since the evaluation is given by $\sigma_1 \mapsto -b$, $\sigma_2 \mapsto c$, and $\sigma_3 \mapsto -d$, we obtain

$$\begin{aligned} \Delta(f) &= \Delta(-b, c, -d) \\ &= -4c^3 - 27(-d)^2 + (-b)^2c^2 - 4(-b)^3(-d) + 18(-b)c(-d) \\ &= -4c^3 - 27d^2 + b^2c^2 - 4b^3d + 18bcd. \end{aligned}$$

This agrees with the formula (1.20) found in Section 1.2.


In the case when we know the roots of a polynomial, we get the following formula 
for its discriminant. 


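Proposition 2.4.3 Suppose that a monic polynomial $f \in F[x]$ of degree $n \ge 2$ has roots $\alpha_1, \dots, \alpha_n$ in a field L containing F. Then

$$\Delta(f) = \prod_{1 \le i < j \le n} (\alpha_i - \alpha_j)^2.$$

Proof: In $F[x_1, \dots, x_n]$, we know that $\Delta = \prod_{1 \le i < j \le n} (x_i - x_j)^2$. Now consider the evaluation map that takes $x_i$ to $\alpha_i$. Since evaluation is a ring homomorphism, this takes $\Delta$ to

$$\prod_{1 \le i < j \le n} (\alpha_i - \alpha_j)^2.$$

If we write $\Delta = \Delta(\sigma_1, \dots, \sigma_i, \dots, \sigma_n)$ as in (2.29), then $x_i \mapsto \alpha_i$ takes $\Delta$ to

$$\Delta(\sigma_1(\alpha_1, \dots, \alpha_n), \dots, \sigma_i(\alpha_1, \dots, \alpha_n), \dots, \sigma_n(\alpha_1, \dots, \alpha_n)).$$

By Corollary 2.1.5, this equals

$$\Delta(-a_1, \dots, (-1)^ia_i, \dots, (-1)^na_n) = \Delta(f)$$

by the definition of $\Delta(f)$. ∎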
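
For a cubic with known roots, the two descriptions of the discriminant can be compared directly. The Python sketch below uses illustrative function names and a sample cubic chosen here, not taken from the text. It evaluates the coefficient formula of Example 2.4.2 and the product formula of Proposition 2.4.3 for $f = (x-1)(x-2)(x-4)$.

```python
def disc_cubic(b, c, d):
    """Discriminant of x^3 + b x^2 + c x + d, using the formula of Example 2.4.2."""
    return -4*c**3 - 27*d**2 + b**2*c**2 - 4*b**3*d + 18*b*c*d

def disc_from_roots(roots):
    """Product of (a_i - a_j)^2 over i < j, as in Proposition 2.4.3."""
    prod = 1
    for i in range(len(roots)):
        for j in range(i + 1, len(roots)):
            prod *= (roots[i] - roots[j]) ** 2
    return prod

# f = (x - 1)(x - 2)(x - 4) = x^3 - 7x^2 + 14x - 8
r1, r2, r3 = 1, 2, 4
b = -(r1 + r2 + r3)
c = r1*r2 + r1*r3 + r2*r3
d = -(r1*r2*r3)
print(disc_cubic(b, c, d), disc_from_roots([r1, r2, r3]))
```

Replacing the roots by 1, 1, 2 makes both values vanish, in line with the fact that the product formula is zero exactly when two roots coincide.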


Mathematical Notes 
There are several ideas in this section in need of further discussion. 


• The Action of the Symmetric Group. This chapter used the action of $S_n$ on the polynomial ring $F[x_1, \dots, x_n]$. For $\sigma \in S_n$ and $f \in F[x_1, \dots, x_n]$, $\sigma \cdot f$ is the polynomial obtained by permuting the variables according to $\sigma$. This operation has the following properties:

$$\begin{aligned} \sigma \cdot (f + g) &= \sigma \cdot f + \sigma \cdot g, \\ \sigma \cdot (fg) &= (\sigma \cdot f)(\sigma \cdot g), \\ \tau \cdot (\sigma \cdot f) &= (\tau\sigma) \cdot f \end{aligned} \tag{2.31}$$

for $\sigma, \tau \in S_n$ and $f, g \in F[x_1, \dots, x_n]$. We have used these properties implicitly throughout the chapter, and we will give a formal proof of (2.31) in Chapter 6. The first two imply that $f \mapsto \sigma \cdot f$ is a ring homomorphism from $F[x_1, \dots, x_n]$ to itself, and the last implies that $(\sigma, f) \mapsto \sigma \cdot f$ is a group action, as defined in Section A.4.


• The Alternating Group. Let F be a field of characteristic different from 2. Proposition 2.4.1 implies that $\sigma \cdot \sqrt{\Delta} = \mathrm{sgn}(\sigma)\sqrt{\Delta}$. Since $-1 \ne +1$ in F,

$$\sigma \cdot \sqrt{\Delta} = \sqrt{\Delta} \iff \mathrm{sgn}(\sigma) = 1 \iff \sigma \in A_n.$$

Thus the alternating group $A_n$ is the subgroup of permutations that fix $\sqrt{\Delta}$.
This leads to the question of which other polynomials or rational functions are fixed by $A_n$. The answer is as follows.


Theorem 2.4.4 Let F be a field of characteristic $\ne 2$. If $f \in F(x_1, \dots, x_n)$ is invariant under $A_n$, then there are $A, B \in F(\sigma_1, \dots, \sigma_n)$ such that

$$f = A + B\sqrt{\Delta}.$$

Furthermore, $f \in F[x_1, \dots, x_n]$ implies that $A, B \in F[\sigma_1, \dots, \sigma_n]$.

We will prove this theorem in Chapter 7. We will also explain how it relates to the Galois correspondence between subgroups of $S_n$ and subfields of $F(x_1, \dots, x_n)$ which contain $F(\sigma_1, \dots, \sigma_n)$.


• The Existence of Roots. Our discussion of discriminants raises an interesting question about roots. Given a monic polynomial f, our definition of $\Delta(f)$ involves only the coefficients of f. However, if we know the roots of f, then we get the simpler formula given by Proposition 2.4.3. This brings up the fundamental question: Does every polynomial in $F[x]$ have roots in a possibly larger field? We will answer this question in Chapter 3.


• Discriminant Formulas. The polynomial expressing $\Delta$ in terms of $\sigma_1, \dots, \sigma_n$ gets more and more complicated as n increases. But if we use determinants, then we get several compact ways to represent both $\Delta$ and $\sqrt{\Delta}$. We begin with Vandermonde's formula for $\sqrt{\Delta}$. See [7, p. 56] for a proof.


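Proposition 2.4.5

$$\sqrt{\Delta} = \det\begin{pmatrix} x_1^{n-1} & x_2^{n-1} & \cdots & x_n^{n-1} \\ x_1^{n-2} & x_2^{n-2} & \cdots & x_n^{n-2} \\ \vdots & \vdots & & \vdots \\ x_1 & x_2 & \cdots & x_n \\ 1 & 1 & \cdots & 1 \end{pmatrix}.$$

The determinant on the right is called a Vandermonde determinant.

In Exercise 1, you will use Proposition 2.4.5 to derive the following formula for $\Delta$ in terms of the power sums $s_i = x_1^i + \cdots + x_n^i$:

$$\Delta = \det\begin{pmatrix} s_{2n-2} & s_{2n-3} & \cdots & s_{n-1} \\ s_{2n-3} & s_{2n-4} & \cdots & s_{n-2} \\ \vdots & \vdots & & \vdots \\ s_{n-1} & s_{n-2} & \cdots & s_0 \end{pmatrix}. \tag{2.32}$$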
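
Both determinant formulas can be tested on sample values. The Python sketch below is a numerical illustration with hypothetical helper names, not a derivation. It builds the Vandermonde matrix of Proposition 2.4.5 and the power-sum matrix of (2.32) for $x = (1, 2, 4)$ and compares their determinants with $\sqrt{\Delta}$ and $\Delta$ computed directly from the definitions.

```python
def det(m):
    """Determinant by cofactor expansion along the first row (fine for small n)."""
    if len(m) == 1:
        return m[0][0]
    total = 0
    for col, entry in enumerate(m[0]):
        minor = [row[:col] + row[col + 1:] for row in m[1:]]
        total += (-1) ** col * entry * det(minor)
    return total

xs = [1, 2, 4]  # sample values for x_1, x_2, x_3
n = len(xs)

# Vandermonde matrix of Proposition 2.4.5: row r holds the powers x_c^(n-1-r).
vandermonde = [[x ** (n - 1 - r) for x in xs] for r in range(n)]

# Power sums s_k = x_1^k + ... + x_n^k and the matrix of (2.32).
s = [sum(x ** k for x in xs) for k in range(2 * n - 1)]
power_sum_matrix = [[s[2 * n - 2 - r - c] for c in range(n)] for r in range(n)]

# sqrt(Delta) straight from its definition as a product over i < j.
sqrt_disc = 1
for i in range(n):
    for j in range(i + 1, n):
        sqrt_disc *= xs[i] - xs[j]

print(det(vandermonde), sqrt_disc, det(power_sum_matrix))
```

The power-sum matrix is the product of the Vandermonde matrix with its transpose, which is the observation behind Exercise 1.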


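In Section 5.3 we will discuss the relation between discriminants and resultants. This leads to a formula that uses the $(2n-1) \times (2n-1)$ matrix M defined by

$$M = \left(\begin{array}{ccc|ccc} 1 & & & n & & \\ -\sigma_1 & \ddots & & -(n-1)\sigma_1 & \ddots & \\ \vdots & \ddots & 1 & \vdots & \ddots & \\ (-1)^{n-1}\sigma_{n-1} & & -\sigma_1 & (-1)^{n-1}\sigma_{n-1} & & n \\ (-1)^n\sigma_n & \ddots & \vdots & & \ddots & -(n-1)\sigma_1 \\ & \ddots & \vdots & & \ddots & \vdots \\ & & (-1)^n\sigma_n & & & (-1)^{n-1}\sigma_{n-1} \end{array}\right),$$

with $n-1$ columns to the left of the bar and $n$ columns to the right, where the empty spaces are filled in with zeros. In (5.13) we will see that

$$\Delta = (-1)^{n(n-1)/2} \det(M).$$

This gives an explicit representation of $\Delta$ in terms of $\sigma_1, \dots, \sigma_n$.
See [5] for further comments on computing discriminants.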


Historical Notes 


The discriminant $\Delta$ can be represented as a polynomial in $x_1, \dots, x_n$ and also as a polynomial in the elementary symmetric polynomials $\sigma_1, \dots, \sigma_n$. The second of these came first, since the discriminant for $n = 3$ is implicit in Cardan's formulas from Chapter 1. By the 1770s, Lagrange and Vandermonde knew the properties of $\Delta$ and $\sqrt{\Delta}$ for small n. For example, when $n = 4$, Lagrange explicitly stated that a transposition changes the sign of $\sqrt{\Delta}$.

The general form of the discriminant was defined independently by Cauchy in 1815 and Gauss in 1816. Cauchy did this in his pioneering studies of the symmetric group $S_n$. For him, a polynomial was symmetric if it was unchanged by transpositions, so that the next class of functions to study were those which changed sign under a transposition. In essence, Cauchy proved that if F has characteristic different from 2 and $f \in F[x_1, \dots, x_n]$ satisfies $\tau \cdot f = -f$ for all transpositions $\tau$, then $f = B\sqrt{\Delta}$ for some $B \in F[\sigma_1, \dots, \sigma_n]$. In Exercise 2 you will show that this follows from Theorem 2.4.4.

Cauchy also considered determinants, drawing on earlier work of Vandermonde and Laplace. He proved Proposition 2.4.5, though he mistakenly attributed it to Vandermonde. In 1841 Jacobi gave the argument used to prove Proposition 2.4.1.

Gauss studied the discriminant in his second proof of the Fundamental Theorem of Algebra. His discussion of $\Delta$ is surprisingly modern. Like us, he initially defines $\Delta$ as a polynomial in $F[x_1, \dots, x_n]$ and then shows that it lies in $F[\sigma_1, \dots, \sigma_n]$. Using the isomorphism

$$\varphi : F[u_1, \dots, u_n] \simeq F[\sigma_1, \dots, \sigma_n]$$

from (2.19) in Section 2.2, Gauss defines $\Delta(u_1, \dots, u_n) \in F[u_1, \dots, u_n]$ to be the polynomial such that $\Delta(\sigma_1, \dots, \sigma_n) = \Delta$ in $F[x_1, \dots, x_n]$. Finally, given a monic polynomial $f \in F[x]$, Gauss defines $\Delta(f)$ just as we did in (2.30). His notation and terminology are different, but his treatment is virtually identical to ours.


Exercises for Section 2.4 


Exercise 1. Let M be the $n \times n$ matrix appearing on the right-hand side of the Vandermonde formula given in Proposition 2.4.5. Prove that (2.32) follows from the fact that M and its transpose both have determinant $\sqrt{\Delta}$.


Exercise 2. Let F have characteristic $\ne 2$, and let $f \in F[x_1, \dots, x_n]$ satisfy $\tau \cdot f = -f$ for all transpositions $\tau \in S_n$. Prove that $f = B\sqrt{\Delta}$ for some $B \in F[\sigma_1, \dots, \sigma_n]$.


Exercise 3. Let $f = x^2 + bx + c \in F[x]$. Use the definition of discriminant given in the text to show that $\Delta(f) = b^2 - 4c$.


Exercise 4. Let $f \in F[x]$ be monic, and assume that $f = (x - \alpha_1) \cdots (x - \alpha_n)$ in some field L containing F. Prove that $\Delta(f) \ne 0$ if and only if $\alpha_1, \dots, \alpha_n$ are distinct. This shows that f has distinct roots if and only if its discriminant is nonvanishing.


Exercise 5. Show that $\sqrt{\Delta} \in F[x_1, \dots, x_n]$ is symmetric if and only if F is a field of characteristic 2.


Exercise 6. Exercise 5 showed how things can differ in a field of characteristic 2. Another example comes from the quadratic formula, which doesn't apply over such fields because of the 2 in the denominator. This exercise will describe how to solve quadratic equations over a field F of characteristic 2.

(a) Given $b \in F$, we will assume there is a larger field $F \subset L$ such that $b = \beta^2$ for some $\beta \in L$. Show that $\beta$ is unique and that $\beta$ is the unique root of $x^2 + b$. Because of this, we denote $\beta$ by $\sqrt{b}$.

(b) Now suppose that $f = x^2 + ax + b$ is a quadratic polynomial in $F[x]$ with $a \ne 0$. Suppose also that f is irreducible over F, so that it has no roots in F. We will see in Chapter 3 that f has a root $\alpha$ in a field L containing F. Prove that $\alpha$ cannot be written in the form $\alpha = u + v\sqrt{w}$ where $u, v, w \in F$.

(c) Part (b) shows that solving a quadratic equation with nonzero x-coefficient requires more than square roots. We do this as follows. If $b \in F$, let $R(b)$ denote a root of $x^2 + x + b$ (possibly lying in some larger field). We call $R(b)$ and $R(b) + 1$ the 2-roots of b. Prove that the roots of $x^2 + x + b$ are $R(b)$ and $R(b) + 1$, and explain why adding 1 to the second 2-root gives the first. Note that in characteristic $\ne 2$, square roots behave as follows: If one square root is $\sqrt{c}$, then we multiply by $-1$ to get the other square root $-\sqrt{c}$. In characteristic 2, 2-roots work the same way, provided that we replace "multiply by $-1$" with "add 1."

(d) Show that the roots of $f = x^2 + ax + b$, $a \ne 0$, are $aR(b/a^2)$ and $a(R(b/a^2) + 1)$.

It follows that when F has characteristic 2, then the roots of $x^2 + ax + b \in F[x]$ are

$$x = \begin{cases} \sqrt{b}, & a = 0, \\ aR(b/a^2),\ a(R(b/a^2) + 1), & a \ne 0. \end{cases}$$

This is the "quadratic formula" over a field of characteristic 2.


Exercise 7. Explain how the third property of (2.31) was used (implicitly) in (2.28) in the 
proof of Proposition 2.4.1. 


Exercise 8. As explained on page 37, we can regard $F[\sigma_1, \dots, \sigma_n]$ as a polynomial ring in the variables $\sigma_1, \dots, \sigma_n$. In this exercise, you will prove that although $\Delta$ factors in $F[x_1, \dots, x_n]$, it is irreducible in $F[\sigma_1, \dots, \sigma_n]$ when F has characteristic different from 2. To begin the proof, assume that $\Delta = AB$, where $A, B \in F[\sigma_1, \dots, \sigma_n]$ are nonconstant.
(a) Using the definition of $\Delta$ and unique factorization in $F[x_1, \dots, x_n]$, show that A is divisible in $F[x_1, \dots, x_n]$ by $x_i - x_j$ for some $1 \le i < j \le n$.
(b) Given $1 \le i < j \le n$ and $1 \le l < m \le n$, show that there is a permutation $\sigma \in S_n$ such that $\sigma(i) = l$ and $\sigma(j) = m$.
(c) Use parts (a) and (b) to show that A is divisible by $x_l - x_m$ for all $1 \le l < m \le n$.
(d) Conclude that A is a multiple of $\sqrt{\Delta}$ and that the same is true for B.
(e) Show that part (d) implies that A and B are constant multiples of $\sqrt{\Delta}$ and explain why this contradicts $A, B \in F[\sigma_1, \dots, \sigma_n]$.
(f) Finally, suppose that F has characteristic 2. Prove that $\Delta$ is not irreducible.


Exercise 9. For $n = 4$, the variables $x_1, x_2, x_3, x_4$ have discriminant

$$\Delta = (x_1 - x_2)^2(x_1 - x_3)^2(x_1 - x_4)^2(x_2 - x_3)^2(x_2 - x_4)^2(x_3 - x_4)^2.$$

Let $y_1 = x_1x_2 + x_3x_4$, $y_2 = x_1x_3 + x_2x_4$, $y_3 = x_1x_4 + x_2x_3$, and consider

$$\theta(y) = (y - y_1)(y - y_2)(y - y_3).$$

This is a cubic polynomial in y. As in the text, the discriminant of $\theta$ will be denoted $\Delta(\theta)$. Show that $\Delta(\theta) = \Delta$. When we discuss Lagrange's work in Chapter 12, we will see that $\theta$ is the Ferrari resolvent, which plays an important role in the solution of the quartic equation.


Exercise 10. Let $C, D \in F[\sigma_1, \dots, \sigma_n]$ be nonzero and relatively prime. This exercise will show that C and D remain relatively prime when regarded as elements of $F[x_1, \dots, x_n]$.
(a) Show that $C^m, D^m$ are relatively prime in $F[\sigma_1, \dots, \sigma_n]$ for any positive integer m.
(b) Suppose that $p \in F[x_1, \dots, x_n]$ is a nonconstant polynomial dividing C and D. Prove that $\sigma \cdot p$ divides C and D for all $\sigma \in S_n$.
(c) As in Exercise 7 of Section 2.2, let $P = \prod_{\sigma \in S_n} \sigma \cdot p$. Show that P divides $C^{n!}$ and $D^{n!}$, and then use part (a) and Exercise 7 of Section 2.2 to obtain a contradiction.

Exercise 11. Exercise 8 of Section 2.2 showed that if $\varphi \in F(x_1, \dots, x_n)$ is symmetric, then $\varphi \in F(\sigma_1, \dots, \sigma_n)$. In this exercise, you will refine this result as follows. Suppose that $\varphi \in F(x_1, \dots, x_n)$ is symmetric, and write $\varphi = A/B$, where $A, B \in F[x_1, \dots, x_n]$ are relatively prime. The claim is that A, B are themselves symmetric and hence lie in $F[\sigma_1, \dots, \sigma_n]$. We can assume that A and B are nonzero.
(a) Use the previous exercise and Exercise 8 of Section 2.2 to show that $\varphi = C/D$ where $C, D \in F[\sigma_1, \dots, \sigma_n]$ are relatively prime in $F[x_1, \dots, x_n]$.
(b) Show that $AD = BC$ and then use unique factorization in $F[x_1, \dots, x_n]$ to show that A and B are constant multiples of C and D, respectively.
(c) Conclude that $A, B \in F[\sigma_1, \dots, \sigma_n]$ as claimed.

CHAPTER 3


ROOTS OF POLYNOMIALS 


This chapter will study the roots of a polynomial in one variable. We will first 
show that every nonconstant polynomial with coefficients in a field has roots in some 
possibly larger field. Then, in the special case of a polynomial with coefficients in 
the field C of complex numbers, we will show that the roots also lie in C. 


3.1 THE EXISTENCE OF ROOTS 


In this section, we will show that given a field F and a nonconstant polynomial 
$f \in F[x]$, there is a field L containing F which also contains all roots of f. We will
motivate our construction by considering the complex numbers C. 

So far we have assumed the existence of the real and complex numbers. But if 
we're given just the real numbers R, how do we get C? There are several ways of 
doing this. For example, in 1835, Hamilton defined 


$$\mathbb{C} = \{(a, b) \mid a, b \in \mathbb{R}\},$$

where addition and multiplication are given by

$$(a, b) + (c, d) = (a + c, b + d) \quad\text{and}\quad (a, b) \cdot (c, d) = (ac - bd, ad + bc).$$

It is straightforward (though somewhat tedious) to verify that these operations make the above set into a field with $(1, 0)$ as the multiplicative identity. Furthermore, the formula for multiplication implies that

$$(0, 1) \cdot (0, 1) = (-1, 0) = -(1, 0).$$

If we let 1 denote $(1, 0)$ and i denote $(0, 1)$, then this equation becomes $i^2 = -1$, and we also have

$$(a, b) = a(1, 0) + b(0, 1) = a \cdot 1 + b \cdot i.$$

In this way, we recover the usual description of $\mathbb{C}$ as the set of numbers of the form $a + bi$, where $a, b \in \mathbb{R}$.
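
Hamilton's ordered-pair arithmetic is simple enough to try out directly. The following Python sketch is an illustration with hypothetical function names. It implements componentwise addition and the multiplication rule $(a, b) \cdot (c, d) = (ac - bd, ad + bc)$ on pairs. The inverse formula $(a, b)^{-1} = (a, -b)/(a^2 + b^2)$ is not stated in the text; it is the standard one and is included here only to illustrate the field property.

```python
from fractions import Fraction

def add(z, w):
    """(a, b) + (c, d) = (a + c, b + d)."""
    return (z[0] + w[0], z[1] + w[1])

def mul(z, w):
    """(a, b) . (c, d) = (ac - bd, ad + bc)."""
    a, b = z
    c, d = w
    return (a * c - b * d, a * d + b * c)

def inverse(z):
    """Inverse of a nonzero pair: (a, -b) divided by a^2 + b^2 (standard formula)."""
    a, b = z
    norm = Fraction(a * a + b * b)
    return (a / norm, -b / norm)

one, i = (1, 0), (0, 1)
print(mul(i, i))                     # the pair playing the role of i squared
print(mul((3, 4), inverse((3, 4))))  # a pair times its inverse
```

Exact rational arithmetic is used so that the product of a pair with its inverse is exactly the identity pair.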


A very different definition of C was given by Cauchy in 1847. He worked in the 
polynomial ring R[x] and defined 


$$\varphi(x) \equiv \chi(x) \quad [\text{mod. } x^2 + 1] \tag{3.1}$$

to mean that $\varphi(x), \chi(x) \in \mathbb{R}[x]$ have the same remainder on division by $x^2 + 1$. Then, to simplify these congruences, he introduced the symbol i as follows:

    ... the symbolic letter i, when substituted for the letter x in a polynomial f(x),
    indicates the value obtained, not by the polynomial f(x), but rather by the
    remainder of the algebraic division of f(x) by x^2 + 1, when one attributes to x
    the particular value i.

(See [Cauchy, p. 317].) This allowed Cauchy to replace (3.1) with the equivalent statement

$$\varphi(i) = \chi(i).$$

To illustrate how this works, Cauchy considered the polynomial

$$f(x) = (a + bx)(c + dx) = ac + bdx^2 + (ad + bc)x.$$

The remainder of $f(x)$ on division by $x^2 + 1$ is easily seen to be $ac - bd + (ad + bc)x$ (be sure you see why), so that by the above quotation, $f(i)$ is defined to be the symbolic expression $ac - bd + (ad + bc)i$. The same process, when applied to $a + bx$ and $c + dx$, yields $a + bi$ and $c + di$, respectively. From this Cauchy concluded that

$$(a + bi)(c + di) = ac - bd + (ad + bc)i. \tag{3.2}$$

Thus we have a symbolic construction of the complex numbers using remainders of polynomials in $\mathbb{R}[x]$ on division by $x^2 + 1$.
From a modern point of view, we can explain Cauchy’s construction as follows. 
In the notation of Section A.1, $x^2 + 1$ generates the ideal

$$\langle x^2 + 1 \rangle = \{(x^2 + 1)g \mid g \in \mathbb{R}[x]\}$$

in the ring $\mathbb{R}[x]$. This gives the quotient ring

$$\mathbb{R}[x]/\langle x^2 + 1 \rangle = \{g + \langle x^2 + 1 \rangle \mid g \in \mathbb{R}[x]\},$$

where we are using the coset notation of Section A.1.
Now, following Cauchy, we take $\varphi \in \mathbb{R}[x]$ and divide it by $x^2 + 1$ to obtain

$$\varphi = q \cdot (x^2 + 1) + a + bx$$

for a unique $q \in \mathbb{R}[x]$ and $a, b \in \mathbb{R}$. Since cosets $g + \langle x^2 + 1 \rangle$, $h + \langle x^2 + 1 \rangle$ in the ring $\mathbb{R}[x]/\langle x^2 + 1 \rangle$ are equal if and only if $g - h \in \langle x^2 + 1 \rangle$, we see that

$$\varphi + \langle x^2 + 1 \rangle = a + bx + \langle x^2 + 1 \rangle.$$

It follows that (3.1) is true if and only if $\varphi$ and $\chi$ give the same coset in $\mathbb{R}[x]/\langle x^2 + 1 \rangle$. Since the remainder of $(a + bx)(c + dx)$ on division by $x^2 + 1$ is $ac - bd + (ad + bc)x$, we have

$$\begin{aligned} (a + bx + \langle x^2 + 1 \rangle)(c + dx + \langle x^2 + 1 \rangle) &= (a + bx)(c + dx) + \langle x^2 + 1 \rangle \\ &= ac - bd + (ad + bc)x + \langle x^2 + 1 \rangle. \end{aligned}$$

Hence Cauchy's construction of $\mathbb{C}$ is equivalent to the quotient ring $\mathbb{R}[x]/\langle x^2 + 1 \rangle$. But we can do even better, for we can also interpret Cauchy's symbolic letter i as the coset $x + \langle x^2 + 1 \rangle$. More precisely, if we identify 1 with $1 + \langle x^2 + 1 \rangle$ and i with $x + \langle x^2 + 1 \rangle$, then

$$a + bx + \langle x^2 + 1 \rangle = a \cdot 1 + b \cdot i,$$

and the symbolic multiplication (3.2) becomes the above multiplication of cosets.

However, interpreting Cauchy's construction as $\mathbb{R}[x]/\langle x^2 + 1 \rangle$ gives only a ring structure. In order for this quotient ring to be a field, we need $\langle x^2 + 1 \rangle$ to be a maximal ideal. The following proposition will be useful.


Proposition 3.1.1 If F is a field and $f \in F[x]$ is nonconstant, then the following are equivalent:

(a) The polynomial f is irreducible over F.

(b) The ideal $\langle f \rangle = \{fg \mid g \in F[x]\}$ is a maximal ideal.

(c) The quotient ring $F[x]/\langle f \rangle$ is a field.


Proof: The equivalence (b) $\iff$ (c) is Theorem A.1.12 from Section A.1. It remains to prove (a) $\iff$ (b).

Suppose f is irreducible and I is an ideal of $F[x]$ such that $\langle f \rangle \subset I \subset F[x]$. By Theorem A.1.17 from Section A.1, $I = \langle g \rangle$ for some $g \in F[x]$. Then $f \in \langle f \rangle \subset I = \langle g \rangle$ implies that $f = gh$ for some $h \in F[x]$. Since f is irreducible, g or h must be constant. We will leave it as Exercise 1 for the reader to show that g constant implies $I = F[x]$ and h constant implies $I = \langle f \rangle$. It follows that $\langle f \rangle$ is maximal.

Conversely, suppose that $\langle f \rangle$ is maximal and let $f = gh$ where $g, h \in F[x]$. This gives $\langle f \rangle \subset \langle g \rangle \subset F[x]$. Since $\langle f \rangle$ is maximal, $\langle g \rangle$ must equal either $\langle f \rangle$ or $F[x]$. In Exercise 1 you will show that the former implies that h is constant and the latter implies that g is constant. Thus f is irreducible. ∎


Since $x^2 + 1$ is irreducible over $\mathbb{R}$, Proposition 3.1.1 implies that $\mathbb{R}[x]/\langle x^2 + 1 \rangle$ is a field. This completes our second construction of $\mathbb{C}$.

One interesting feature of the two constructions of $\mathbb{C}$ just given is that neither contains $\mathbb{R}$. This might seem contradictory, but consider the following:

• In Hamilton's construction, a complex number is an ordered pair $(a, b)$ of real numbers. In order for this definition of $\mathbb{C}$ to contain $\mathbb{R}$, we must identify the number $a \in \mathbb{R}$ with the ordered pair $(a, 0) \in \mathbb{C}$.

• In the modern interpretation of Cauchy's construction, a complex number is a coset $g + \langle x^2 + 1 \rangle$. In order for this to contain $\mathbb{R}$, we must identify the number $a \in \mathbb{R}$ with the coset $a + \langle x^2 + 1 \rangle \in \mathbb{C}$.

Both constructions give one-to-one homomorphisms $\mathbb{R} \to \mathbb{C}$ that become inclusions after we identify $\mathbb{R}$ with its image in $\mathbb{C}$. This motivates the following definition.


Definition 3.1.2 Given a ring homomorphism of fields $\varphi : F \to L$, we say that L is a field extension of F via $\varphi$. We will usually identify F with its image

$$\varphi(F) = \{\varphi(a) \mid a \in F\} \subset L$$

and write $F \subset L$.


Recall from Section A.1 that by definition, a ring homomorphism maps 1 to 1. Using this, in Exercise 2 you will show that a ring homomorphism of fields $\varphi : F \to L$ is automatically one-to-one and induces an isomorphism $\varphi : F \simeq \varphi(F)$. Hence once we identify F with $\varphi(F) \subset L$ via $\varphi$, we may regard F as a subfield of L. For the two constructions of $\mathbb{C}$ given above, this gives $\mathbb{R} \subset \mathbb{C}$, as desired.

Armed with this notion of a field extension, we can prove that every irreducible 
polynomial has a root in an extension field. 


Proposition 3.1.3 If $f \in F[x]$ is irreducible, then there is an extension field $F \subset L$ and $\alpha \in L$ such that $f(\alpha) = 0$.


Proof: Let $I = \langle f \rangle$, so that $L = F[x]/I$ is a field by Proposition 3.1.1. Furthermore, $a \in F$ gives the constant polynomial $a \in F[x]$, which in turn gives the coset $a + I \in L$. Thus we get a natural map $\varphi : F \to L$. In Exercise 3 you will check that $\varphi$ is a ring homomorphism, so that using the convention of Definition 3.1.2, we get a field extension $F \subset L$.

It remains to show that there is $\alpha \in L$ such that $f(\alpha) = 0$. This is surprisingly easy. Motivated by Cauchy's symbolic construction, we set $\alpha = x + I$. To prove that $f(\alpha) = 0$, suppose that $f = a_0x^n + \cdots + a_n$, where $a_i \in F$. Then, recalling our identification of $a \in F$ with the coset $a + I \in L$, we have

$$\begin{aligned} f(\alpha) &= (a_0 + I)\alpha^n + \cdots + (a_n + I) \\ &= (a_0 + I)(x + I)^n + \cdots + (a_n + I) \\ &= (a_0x^n + \cdots + a_n) + I = f + I \\ &= 0 + I, \end{aligned}$$

where the third equality uses the definition of addition and multiplication of cosets,


and the last uses $f + I = 0 + I \iff f - 0 \in I$. Since $0 + I$ is the additive identity of L, we have $f(\alpha) = 0$, as claimed. ∎

Recall the elementary fact that $\alpha \in L$ is a root of a polynomial $f \in L[x]$ if and only if $x - \alpha$ is a factor of f in $L[x]$ (this is Corollary A.1.15). Thus, to say that a field L contains all roots of f means that f factors as

$$f = a_0(x - \alpha_1) \cdots (x - \alpha_n),$$

where $\alpha_1, \dots, \alpha_n \in L$. When this happens, we say that f splits completely over L.


Theorem 3.1.4 Let $f \in F[x]$ be a polynomial of degree $n > 0$. Then there is an extension field $F \subset L$ such that f splits completely over L.


Proof: We will prove this using induction on $n = \deg(f)$. If $n = 1$, then $f = a_0x + a_1$, where $a_0 \ne 0$ and $a_0, a_1 \in F$. Setting $L = F$ and $\alpha_1 = -a_1/a_0$ implies that $f = a_0(x - \alpha_1)$ and proves the theorem in this case.

Now suppose that $\deg(f) = n > 1$ and that the theorem is true for $n - 1$. Since F is a field, $F[x]$ is a UFD by Theorem 2.1.1. In particular, f has an irreducible divisor $f_1$. If we apply Proposition 3.1.3 to $f_1 \in F[x]$, then we get an extension field $F \subset F_1$ and an element $\alpha_1 \in F_1$ such that $f_1(\alpha_1) = 0$ in $F_1$.

Since $f_1$ is a factor of f, we also have $f(\alpha_1) = 0$ in $F_1$. As noted above, this implies that $x - \alpha_1$ is a factor of f in $F_1[x]$. In other words,

$$f = (x - \alpha_1)g$$

for some $g \in F_1[x]$ of degree $n - 1$. Applying our inductive hypothesis to g, we get a field extension $F_1 \subset L$ and elements $\alpha_2, \dots, \alpha_n \in L$ such that

$$g = a_0(x - \alpha_2) \cdots (x - \alpha_n).$$

The displayed formulas for f and g show that f splits completely over L. ∎
Mathematical Notes 
This section includes several ideas which are worthy of comment. 


• Identifications. In Definition 3.1.2, we wrote a field extension $\varphi : F \to L$ as $F \subset L$ by identifying F with $\varphi(F)$. This might seem like cheating, but it happens all the time in mathematics. For example, consider $\mathbb{Z} \subset \mathbb{Q}$. Since $\mathbb{Q}$ is the field of fractions of the integral domain $\mathbb{Z}$, an element $a/b \in \mathbb{Q}$ is the equivalence class

$$\frac{a}{b} = \{(c, d) \mid c, d \in \mathbb{Z},\ d \ne 0,\ ad = bc\}. \tag{3.3}$$

(See Exercise 4 for the details.) In particular, according to (3.3), an integer $n \in \mathbb{Z}$ doesn't equal the fraction $n/1 \in \mathbb{Q}$, since n is an integer and $n/1$ is an infinite set of ordered pairs of integers. Rather, we have the ring homomorphism

$$\varphi : \mathbb{Z} \to \mathbb{Q}$$


which sends n to $n/1$, and we write $\mathbb{Z} \subset \mathbb{Q}$ by identifying $\mathbb{Z}$ with $\varphi(\mathbb{Z})$. This is similar to what we did in the discussion preceding Definition 3.1.2.

• Construction of Extension Fields. Beginning students in algebra often have difficulty with cosets and quotient rings. The key insight is that in a quotient ring $R/I$, elements of the ideal I become zero. This is because $r \in I$ gives the coset $r + I$, which equals the zero coset $0 + I$, since $r - 0 \in I$. Applying this to the situation of Proposition 3.1.3, $f \in \langle f \rangle$ means that $f + \langle f \rangle$ is zero in $F[x]/\langle f \rangle$. But $f + \langle f \rangle$ is f applied to $\alpha = x + \langle f \rangle$, so that $\alpha$ is a root of f in $F[x]/\langle f \rangle$.

When f is irreducible, Proposition 3.1.3 also showed that $L = F[x]/\langle f \rangle$ is a field. But in practice, if we are given a nonzero coset $g + \langle f \rangle$, how do we find its multiplicative inverse in L? In Exercise 5 you will show the following:


• f and g are relatively prime, so that $Af + Bg = 1$ for some $A, B \in F[x]$.
• The multiplicative inverse of $g + \langle f \rangle$ in $L = F[x]/\langle f \rangle$ is the coset $B + \langle f \rangle$.
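
The two bullet points above translate into a short computation: run the Euclidean algorithm on f and g while tracking the coefficient of g, then scale so that the gcd becomes 1. The Python sketch below is one possible illustration over $F = \mathbb{Q}$ and is not the method of Exercise 5 verbatim. The function names are hypothetical, and polynomials are stored as coefficient lists with the lowest degree first.

```python
from fractions import Fraction

def trim(p):
    """Drop trailing zero coefficients (lists are lowest degree first)."""
    while p and p[-1] == 0:
        p = p[:-1]
    return p

def sub(p, q):
    n = max(len(p), len(q))
    p, q = p + [0] * (n - len(p)), q + [0] * (n - len(q))
    return trim([a - b for a, b in zip(p, q)])

def mul(p, q):
    if not p or not q:
        return []
    out = [Fraction(0)] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return trim(out)

def divmod_poly(p, q):
    """Quotient and remainder of p on division by a nonzero, trimmed q."""
    rem = trim(list(p))
    quo = [Fraction(0)] * max(len(rem) - len(q) + 1, 0)
    while len(rem) >= len(q):
        shift = len(rem) - len(q)
        coeff = Fraction(rem[-1]) / q[-1]
        quo[shift] = coeff
        rem = sub(rem, mul([0] * shift + [coeff], q))
    return trim(quo), rem

def inverse_mod(g, f):
    """Return B with B*g = 1 modulo f, assuming f and g are relatively prime."""
    r0, r1 = trim(list(f)), trim(list(g))
    b0, b1 = [], [Fraction(1)]  # invariant: each r_k = (multiple of f) + b_k * g
    while r1:
        q, r = divmod_poly(r0, r1)
        r0, r1 = r1, r
        b0, b1 = b1, sub(b0, mul(q, b1))
    scale = [Fraction(1) / r0[0]]  # r0 is now a nonzero constant
    return divmod_poly(mul(scale, b0), trim(list(f)))[1]

# Inverse of 1 + 2x modulo x^2 + 1, i.e. of 1 + 2i in Cauchy's construction.
print(inverse_mod([1, 2], [1, 0, 1]))
# Inverse of x modulo x^3 - 2.
print(inverse_mod([0, 1], [-2, 0, 0, 1]))
```

Only the coefficient B of g is tracked, since the coefficient A of f disappears in the quotient ring.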


While it is important to be able to manipulate cosets at an abstract level, it is also 
often useful to represent them concretely. This means coming up with a method for 
picking a unique element—a coset representative—from each coset. In the case of 
$L = F[x]/\langle f \rangle$, we will show in Chapter 4 that if f has degree n, then every coset in $F[x]/\langle f \rangle$ can be written uniquely in the form

$$c_0 + c_1x + \cdots + c_{n-1}x^{n-1} + \langle f \rangle,$$

where $c_0, \dots, c_{n-1} \in F$. The rough idea is that given a coset $g + \langle f \rangle$, we replace g with its remainder on division by f, which is a polynomial of degree at most $n - 1$. Furthermore, setting $\alpha = x + \langle f \rangle$ as in the proof of Proposition 3.1.3, we can rewrite the above expression as

$$c_0 + c_1\alpha + \cdots + c_{n-1}\alpha^{n-1}. \tag{3.4}$$

When $F = \mathbb{R}$ and $f = x^2 + 1$, this is what Cauchy did in his construction of $\mathbb{C}$.
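
Working with the representatives (3.4) amounts to multiplying polynomials and replacing the result by its remainder on division by f. The Python sketch below illustrates this for a monic f with integer coefficients; the function names and the sample polynomial $x^3 - 2$ are choices made here, not taken from the text. It also confirms that $\alpha = x + \langle f \rangle$ satisfies $f(\alpha) = 0$, as in the proof of Proposition 3.1.3.

```python
def reduce_mod(p, f):
    """Remainder of p on division by a monic f (lists are lowest degree first)."""
    p = list(p)
    n = len(f) - 1                        # degree of f
    while len(p) > n:
        lead = p.pop()                    # coefficient of the current top power
        shift = len(p) - n
        for k in range(n):
            p[shift + k] -= lead * f[k]   # replace x^n by -(f - x^n)
    return p + [0] * (n - len(p))

def mul_mod(p, q, f):
    """Product of two coset representatives, reduced modulo f."""
    prod = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            prod[i + j] += a * b
    return reduce_mod(prod, f)

def evaluate_at(poly, elem, f):
    """Evaluate poly at an element of the quotient ring using Horner's rule."""
    result = [0] * (len(f) - 1)
    for coeff in reversed(poly):
        result = mul_mod(result, elem, f)
        result[0] += coeff
    return result

f = [-2, 0, 0, 1]       # x^3 - 2
alpha = [0, 1, 0]       # the coset of x

print(evaluate_at(f, alpha, f))              # f applied to alpha
print(mul_mod([1, 1, 0], [0, 0, 1], f))      # (1 + alpha) * alpha^2
print(mul_mod([1, 2], [3, 4], [1, 0, 1]))    # (1 + 2i)(3 + 4i) via x^2 + 1
```

The last line repeats Cauchy's computation (3.2) with $a = 1$, $b = 2$, $c = 3$, $d = 4$.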

The idea of representing cosets by remainders can be applied to other quotient
rings as well. For example, the theory of Gröbner bases enables one to represent
elements of the quotient ring

\[ F[x_1, \dots, x_n]/(f_1, \dots, f_s), \quad f_1, \dots, f_s \in F[x_1, \dots, x_n], \]

uniquely by remainders (see [3, Ch. 5, §2]).


■ Construction of Splitting Fields. The proof of Theorem 3.1.4 constructs a field
over which $f \in F[x]$ splits completely by iterating the quotient ring construction of
Proposition 3.1.3. Hence elements of this field are cosets of cosets of cosets, etc. 
This seems very abstract until one remembers that in modern algebra, we don’t care 
what the objects are; it is their behavior that counts. Since the field has the desired 
behavior, we are content. 

In Chapter 5, we will give a refined version of Theorem 3.1.4 where $L$ is chosen
to be the smallest extension of $F$ over which $f \in F[x]$ splits completely. We will call
this a splitting field. We will show that splitting fields are unique up to isomorphism.


■ The Complex Numbers. This section began with two constructions of the complex
numbers $\mathbb{C}$. The one of greater interest to us was Cauchy’s, which eventually led to
Proposition 3.1.3. The other construction of $\mathbb{C}$, due to Hamilton, used ordered pairs
of real numbers. This suggests using triples of real numbers, and Hamilton tried
hard to define addition and multiplication so that such triples would form a field. He
didn’t succeed, but on October 16, 1843 he realized that this idea would work with
quadruples $(a, b, c, d)$ provided multiplication wasn’t required to be commutative. If
the standard basis of $\mathbb{R}^4$ is denoted $1, i, j, k$, then

\[ (a, b, c, d) = a1 + bi + cj + dk, \]

and Hamilton defined multiplication so that 1 is the multiplicative identity and

\[ i^2 = j^2 = k^2 = -1, \quad ij = -ji = k, \quad jk = -kj = i, \quad ki = -ik = j. \]

These are the famous quaternions. They form a division ring, which is a
noncommutative ring where every nonzero element has a multiplicative inverse.
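
Hamilton's rules determine the product of any two quadruples by bilinearity. The sketch below is one way to spell this out. It represents $a1 + bi + cj + dk$ as the tuple `(a, b, c, d)`; the names `qmul` and `qinv` are hypothetical. The inverse uses the standard fact that a quaternion times its conjugate $(a, -b, -c, -d)$ equals $a^2 + b^2 + c^2 + d^2$.

```python
from fractions import Fraction

def qmul(p, q):
    """Product of quaternions p and q, obtained by expanding with Hamilton's rules."""
    a1, b1, c1, d1 = p
    a2, b2, c2, d2 = q
    return (
        a1 * a2 - b1 * b2 - c1 * c2 - d1 * d2,   # coefficient of 1
        a1 * b2 + b1 * a2 + c1 * d2 - d1 * c2,   # coefficient of i
        a1 * c2 - b1 * d2 + c1 * a2 + d1 * b2,   # coefficient of j
        a1 * d2 + b1 * c2 - c1 * b2 + d1 * a2,   # coefficient of k
    )

def qinv(q):
    """Multiplicative inverse of a nonzero quaternion: conjugate divided by the norm."""
    a, b, c, d = q
    norm = Fraction(a * a + b * b + c * c + d * d)
    if norm == 0:
        raise ZeroDivisionError("the zero quaternion has no inverse")
    return (a / norm, -b / norm, -c / norm, -d / norm)

one, i, j, k = (1, 0, 0, 0), (0, 1, 0, 0), (0, 0, 1, 0), (0, 0, 0, 1)
print(qmul(i, j), qmul(j, i))        # k and -k, so multiplication is not commutative
q = (1, 2, 3, 4)
print(qmul(q, qinv(q)))              # the identity, with exact rational entries
```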


Historical Notes 


In solving the cubic and quartic equations, Cardan and Ferrari implicitly assumed 
the existence of roots, just as we did in Chapter 1. Girard, in the early seventeenth 
century, was one of the first to assert the existence of roots, real or imaginary, though 
“imaginary root” did not have a clear meaning in his work. As people became more 
comfortable with complex numbers, the existence of roots evolved into the existence 
of complex roots, which come in complex conjugate pairs when the coefficients are 
real. Thus the eighteenth-century version of the Fundamental Theorem of Algebra
asserts that every nonconstant polynomial in $\mathbb{R}[x]$ factors into linear and quadratic
factors with coefficients in $\mathbb{R}$. In Section 3.2, we will prove the equivalent statement
that every nonconstant polynomial in $\mathbb{C}[x]$ splits completely over $\mathbb{C}$.

The first attempt to prove the Fundamental Theorem of Algebra was due to 
D’Alembert in 1746, and at roughly the same time Euler discovered an algebraic 
proof (still somewhat incomplete), to be discussed in the next section. Like Cardan, 
Euler implicitly assumed that the roots exist. In 1799, Gauss noted that Euler’s proof
in effect made the assumption that

every equation can be satisfied by a real value of the unknown, or by an imaginary
value of the form $a + b\sqrt{-1}$, or by a value that is not subsumed under any form.

(See [Gauss, Vol. III, p. 14].) Gauss criticized this assumption as follows:

How these magnitudes of which we can form no idea whatsoever—these shadows
of shadows—are to be added or multiplied cannot be understood with the kind
of clarity required by mathematics.


The main result of this section, Theorem 3.1.4, answers Gauss’s criticism very
nicely. Given $f \in \mathbb{R}[x]$ of positive degree, we can regard $f$ as lying in $\mathbb{C}[x]$. Applying
Theorem 3.1.4, we get an extension $\mathbb{C} \subset L$ where $f$ splits completely over $L$. Then,
as Gauss observes in the first part of the quote, each root of $f$ either lies in $\mathbb{R}$, in
$\mathbb{C}$, or in $L$. However, the roots in $L$ are no longer “shadows of shadows” but rather
elements of a field which can be manipulated by the usual operations of algebra, just
as Euler assumed they could.

Gauss’s 1815 proof of the Fundamental Theorem of Algebra uses symbolic methods
to avoid assuming the existence of roots, though his actual construction was
quite different from what we did in Theorem 3.1.4. We will say more about Gauss’s
argument in the next section.

In his 1847 construction of the complex numbers, Cauchy defined congruences 
modulo an arbitrary polynomial f. He also introduced sums similar to (3.4). How- 
ever, Cauchy did not recognize the importance of f being irreducible, which by 
Proposition 3.1.1 is necessary if we want the quotient ring to be a field. 

The general case of this construction is due to Kronecker. He developed an
elaborate theory of algebraic quantities in his 1881-1882 treatise Grundzüge einer
arithmetischen Theorie der algebraischen Grössen [Kronecker, Vol. II, pp. 237-387]
and applied these ideas to the existence of roots in his 1887 paper Ein Fundamentalsatz
der allgemeinen Arithmetik [Kronecker, Vol. II, pp. 209-240]. His version of
Theorem 3.1.4 uses the language of congruences (rather than cosets) to construct an
extension $F \subset L$ in which $f \in F[x]$ splits completely. In Chapter 12, we will see how
Kronecker drew on ideas of Lagrange and Galois to create $L$ using a single quotient,
rather than the sequence of quotients used in the proof of Theorem 3.1.4.


Exercises for Section 3.1 


Exercise 1. This exercise is concerned with the proof of Proposition 3.1.1. Suppose that
$f, g, h \in F[x]$ are polynomials such that $f$ is nonzero and $f = gh$. Also let $I = (g)$.

(a) Prove that $g$ is constant if and only if $I = F[x]$.

(b) Prove that $h$ is constant if and only if $I = (f)$.


Exercise 2. Let $F$ and $L$ be fields, and let $\varphi : F \to L$ be a ring homomorphism as defined in
Section A.1. Prove that $\varphi$ is one-to-one and that we get an isomorphism $\varphi : F \simeq \varphi(F)$.


Exercise 3. Let $I \subset F[x]$ be an ideal, and define $\varphi : F \to F[x]/I$ by $\varphi(a) = a + I$. Prove
carefully that $\varphi$ is a ring homomorphism.


Exercise 4. In your abstract algebra text, review the definition of the field of fractions of an
integral domain and verify that (3.3) is the correct definition of $a/b$ for $a, b \in \mathbb{Z}$, $b \ne 0$.


Exercise 5. Let $f \in F[x]$ be irreducible, and let $g + (f)$ be a nonzero coset in the quotient ring
$L = F[x]/(f)$.
(a) Show that $f$ and $g$ are relatively prime and conclude that $Af + Bg = 1$, where $A, B$ are
polynomials in $F[x]$.
(b) Show that $B + (f)$ is the multiplicative inverse of $g + (f)$ in $L$.


Exercise 6. Apply the method of Exercise 5 to find the multiplicative inverse of the coset
$1 + x + (x^2 + x + 1)$ in the field $\mathbb{Q}[x]/(x^2 + x + 1)$.


3.2 THE FUNDAMENTAL THEOREM OF ALGEBRA 
The Fundamental Theorem of Algebra asserts that every nonconstant $f \in \mathbb{C}[x]$ splits
completely over $\mathbb{C}$. In other words,

\[ f = a_0 (x - \alpha_1) \cdots (x - \alpha_n) \]

for some $a_0, \alpha_1, \dots, \alpha_n \in \mathbb{C}$ with $a_0 \ne 0$.

The following proposition shows that there are several different ways of stating
the Fundamental Theorem.


Proposition 3.2.1 The following are equivalent:

(a) Every nonconstant $f \in \mathbb{C}[x]$ has at least one root in $\mathbb{C}$.
(b) Every nonconstant $f \in \mathbb{C}[x]$ splits completely over $\mathbb{C}$.
(c) Every nonconstant $f \in \mathbb{R}[x]$ has at least one root in $\mathbb{C}$.


Proof: For (a) $\Rightarrow$ (b), we use induction on $n = \deg(f)$. For the base case $n = 1$,
writing $f = ax + b = a(x - (-b/a))$ shows that $f$ splits completely over $\mathbb{C}$.

Now suppose that $n > 1$ and that our assertion is true for $n - 1$. If $f \in \mathbb{C}[x]$
has degree $n$, then assumption (a) implies that $f(\alpha) = 0$ for some $\alpha \in \mathbb{C}$. By
Corollary A.1.15, this implies that $f$ is divisible by $x - \alpha$. Thus

\[ f = (x - \alpha) g \]

for some $g \in \mathbb{C}[x]$ of degree $n - 1$. By our inductive assumption, $g$ splits completely
over $\mathbb{C}$, and then the above equation shows that the same is true for $f$.

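The implication (b) $\Rightarrow$ (c) is clear since $\mathbb{R} \subset \mathbb{C}$. To prove (c) $\Rightarrow$ (a), we must show
that $f = a_0 x^n + \cdots + a_n \in \mathbb{C}[x]$ has a root in $\mathbb{C}$ when $n > 0$ and $a_0 \ne 0$. Let

\[ \bar{f} = \bar{a}_0 x^n + \cdots + \bar{a}_n \tag{3.5} \]

denote the polynomial obtained by taking the complex conjugates of the coefficients
of $f$. In Exercise 1 you will prove that if $f, g \in \mathbb{C}[x]$, then

\[ \overline{fg} = \bar{f}\,\bar{g}. \]

Now let $h = f\bar{f} \in \mathbb{C}[x]$. Then

\[ \bar{h} = \overline{f\bar{f}} = \bar{f}\,\bar{\bar{f}} = \bar{f} f = h \]

implies that $h \in \mathbb{R}[x]$. By (c), we can find $\alpha \in \mathbb{C}$ such that $h(\alpha) = 0$. Then $f(\alpha)\bar{f}(\alpha) =
0$, so that $f(\alpha) = 0$ or $\bar{f}(\alpha) = 0$. In the former case, $\alpha \in \mathbb{C}$ is a root of $f$, and in the
latter, Exercise 1 will show that $\bar{\alpha} \in \mathbb{C}$ is a root of $f$. This completes the proof of the
proposition. ∎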
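
As a small illustration of the step (c) $\Rightarrow$ (a), here is a worked example with a polynomial chosen only for this purpose, namely $f = x - i$. It shows how a root of $f\bar{f}$ leads to a root of $f$ in each of the two cases.

```latex
\[
  f = x - i, \qquad \bar{f} = x + i, \qquad
  h = f\bar{f} = (x - i)(x + i) = x^2 + 1 \in \mathbb{R}[x].
\]
% The roots of h in C are i and -i.
\[
  \alpha = i:\quad f(\alpha) = 0, \text{ so } \alpha \text{ itself is a root of } f;
\]
\[
  \alpha = -i:\quad f(\alpha) = -2i \ne 0 \text{ but } \bar{f}(\alpha) = 0,
  \text{ so } \bar{\alpha} = i \text{ is a root of } f.
\]
```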


We next study polynomials of odd degree with real coefficients.
Proposition 3.2.2 Every $f \in \mathbb{R}[x]$ of odd degree has at least one root in $\mathbb{R}$.


Proof: We will use the Intermediate Value Theorem (IVT) from calculus. We know
that $f \in \mathbb{R}[x]$ is continuous. Thus, if we can find $M > 0$ such that

\[ f(-M) < 0 < f(M), \tag{3.6} \]

then the IVT, applied to $f$ on the interval $[-M, M]$, will guarantee that $f(c) = 0$ for
some $c \in (-M, M)$.
Given $f \in \mathbb{R}[x]$ of odd degree, we can assume that $f$ is monic by multiplying $f$ by
a suitable nonzero constant. Then
\[ f = x^n + a_1 x^{n-1} + \cdots + a_n, \]
where $n$ is odd and $a_1, \dots, a_n \in \mathbb{R}$. If we set
\[ M = |a_1| + \cdots + |a_n| + 1, \]
then
\[
\begin{aligned}
|a_1 M^{n-1} + \cdots + a_n| &\le |a_1| M^{n-1} + |a_2| M^{n-2} + \cdots + |a_n| \\
&\le (|a_1| + |a_2| + \cdots + |a_n|) M^{n-1} \\
&< M^n,
\end{aligned}
\tag{3.7}
\]
where the first inequality uses the triangle inequality, the second uses $M \ge 1$, and the
third uses $M > |a_1| + |a_2| + \cdots + |a_n|$. It follows that

\[ f(M) = M^n + (a_1 M^{n-1} + \cdots + a_n) > 0 \]
since the expression in parentheses has absolute value $< M^n$ by (3.7). We also have
\[ |a_1 (-M)^{n-1} + a_2 (-M)^{n-2} + \cdots + a_n| < M^n \]
by a similar argument. Then
\[ f(-M) = -M^n + (a_1 (-M)^{n-1} + a_2 (-M)^{n-2} + \cdots + a_n) < 0, \]

since $n$ is odd and the expression in parentheses has absolute value $< M^n$.
Thus $M$ satisfies (3.6). As noted above, the proposition follows. ∎
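
The bound $M$ in this proof is explicit, so it can be checked numerically. The following sketch computes $M$ for a monic polynomial of odd degree, confirms the sign change (3.6), and then uses bisection as a computational stand-in for the IVT. Bisection with floating-point numbers only approximates a root, and the function names `poly_eval` and `real_root` are hypothetical.

```python
def poly_eval(coeffs, x):
    """Evaluate x^n + a_1 x^(n-1) + ... + a_n by Horner's rule, where coeffs = [a_1, ..., a_n]."""
    value = 1.0
    for a in coeffs:
        value = value * x + a
    return value

def real_root(coeffs, tol=1e-12):
    """Approximate a real root of a monic polynomial of odd degree n = len(coeffs)."""
    if len(coeffs) % 2 == 0:
        raise ValueError("the degree must be odd")
    M = sum(abs(a) for a in coeffs) + 1          # the bound M from the proof
    lo, hi = -M, M
    assert poly_eval(coeffs, lo) < 0 < poly_eval(coeffs, hi)   # inequality (3.6)
    while hi - lo > tol:                          # keep f(lo) < 0 <= f(hi) throughout
        mid = (lo + hi) / 2
        if poly_eval(coeffs, mid) < 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

coeffs = [0, -2, -5]                              # f = x^3 - 2x - 5, so M = 8
root = real_root(coeffs)
print(root, poly_eval(coeffs, root))
```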


Finally, we note the following simple consequence of the quadratic formula.
Lemma 3.2.3 Every quadratic polynomial in $\mathbb{C}[x]$ splits completely over $\mathbb{C}$.


Proof: Given $f = ax^2 + bx + c \in \mathbb{C}[x]$ with $a \ne 0$, the quadratic formula implies
that the roots of $f$ are given by

\[ \frac{-b \pm \sqrt{b^2 - 4ac}}{2a}. \]

By Section A.2, every complex number has a square root in $\mathbb{C}$. Hence the above
roots are complex numbers, which shows that $f$ splits completely over $\mathbb{C}$. ∎
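
The quadratic formula is easy to try out with complex coefficients. The sketch below uses Python's standard `cmath.sqrt` for the complex square root. The function name `quadratic_roots` is hypothetical, and the results are floating-point approximations rather than exact values.

```python
import cmath

def quadratic_roots(a, b, c):
    """Roots of a*x^2 + b*x + c for complex a, b, c with a != 0, via the quadratic formula."""
    if a == 0:
        raise ValueError("the leading coefficient must be nonzero")
    s = cmath.sqrt(b * b - 4 * a * c)     # one square root of the discriminant
    return (-b + s) / (2 * a), (-b - s) / (2 * a)

# Example: x^2 - (1 + i)x + i = (x - 1)(x - i)
r1, r2 = quadratic_roots(1, -(1 + 1j), 1j)
print(r1, r2)
```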


We can now prove the Fundamental Theorem of Algebra.
Theorem 3.2.4 Every nonconstant $f \in \mathbb{C}[x]$ splits completely over $\mathbb{C}$.


Proof: Our proof will follow a strategy due to Euler, together with a clever idea
first used by Laplace. By Proposition 3.2.1, it suffices to prove that


(3.8) Every $f \in \mathbb{R}[x]$ of degree $n > 0$ has at least one root in $\mathbb{C}$.

We can write $n$ uniquely in the form
\[ n = 2^m k, \quad k \text{ odd}, \quad m \ge 0. \]

Euler’s strategy is to prove (3.8) by induction on $m$. By Proposition 3.2.2, a polynomial
of odd degree in $\mathbb{R}[x]$ has a root in $\mathbb{R} \subset \mathbb{C}$. Hence (3.8) is true when $m = 0$.

Now suppose that $m > 0$ and that (3.8) is true for $m - 1$. Take $f \in \mathbb{R}[x]$ of degree
$n = 2^m k$, $k$ odd. We can regard $f$ as a polynomial in $\mathbb{C}[x]$, so that Theorem 3.1.4
implies that there is an extension $\mathbb{C} \subset L$ such that $f$ splits completely over $L$. We will
denote the roots of $f$ by $\alpha_1, \dots, \alpha_n \in L$.

Laplace’s clever idea is to consider the following auxiliary polynomial. Pick a
real number $\lambda$, and set

\[ g_\lambda(x) = \prod_{1 \le i < j \le n} \bigl(x - (\alpha_i + \alpha_j) + \lambda \alpha_i \alpha_j\bigr). \]

This has degree $\frac{1}{2} n(n-1)$, the number of distinct pairs $i < j$.
We first claim that $g_\lambda$ has coefficients in $\mathbb{R}$. To prove this, consider

\[ G_\lambda(x) = \prod_{1 \le i < j \le n} \bigl(x - (x_i + x_j) + \lambda x_i x_j\bigr). \tag{3.9} \]

The identity
\[ x - (x_i + x_j) + \lambda x_i x_j = x - (x_j + x_i) + \lambda x_j x_i \]
shows that $G_\lambda$ is a product indexed by pairs of distinct variables. It follows easily
that $G_\lambda$ is unaffected by permutations of the $x_i$. Then multiplying out $G_\lambda$ gives
\[ G_\lambda(x) = \sum_{i=0}^{\frac{1}{2} n(n-1)} p_i(x_1, \dots, x_n)\, x^i, \]

where the polynomials $p_i(x_1, \dots, x_n)$ are symmetric in $x_1, \dots, x_n$ since $G_\lambda$ is. Also
note that $p_i(x_1, \dots, x_n) \in \mathbb{R}[x_1, \dots, x_n]$ since $\lambda \in \mathbb{R}$. Then Corollary 2.2.5 implies that
$p_i(\alpha_1, \dots, \alpha_n) \in \mathbb{R}$ since $\alpha_1, \dots, \alpha_n$ are the roots of $f \in \mathbb{R}[x]$. We conclude that
\[ g_\lambda(x) = \sum_{i=0}^{\frac{1}{2} n(n-1)} p_i(\alpha_1, \dots, \alpha_n)\, x^i \in \mathbb{R}[x]. \]

We next compute the exponent of 2 in the degree of $g_\lambda$. Using $n = 2^m k$, the degree
of $g_\lambda$ is given by
\[ \tfrac{1}{2} n(n-1) = \tfrac{1}{2}\, 2^m k (2^m k - 1). \]

Since $m > 0$, we can write this as
\[ \deg(g_\lambda) = 2^{m-1} k (2^m k - 1). \tag{3.10} \]

Furthermore, $k$ odd and $m > 0$ imply that $k(2^m k - 1)$ is odd. Thus, even though $g_\lambda$
has larger degree than $f$, the exponent of 2 has been reduced by one.
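
A quick numerical check makes the drop in the exponent of 2 tangible. The sketch below splits $n$ as $2^m k$ with $k$ odd and does the same for $\frac{1}{2} n(n-1)$. The function name `two_adic_split` is hypothetical, and the sample degrees are arbitrary even numbers.

```python
def two_adic_split(n):
    """Return (m, k) with n = 2^m * k and k odd, for a positive integer n."""
    m = 0
    while n % 2 == 0:
        n //= 2
        m += 1
    return m, n

for n in (2, 4, 6, 12, 40):
    m, k = two_adic_split(n)
    degree = n * (n - 1) // 2                 # degree of the auxiliary polynomial
    m_new, k_new = two_adic_split(degree)
    # m_new equals m - 1, and k_new equals k * (2^m * k - 1), which is odd
    print(f"n = {n} = 2^{m} * {k}, degree = {degree} = 2^{m_new} * {k_new}")
```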
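It follows that for any real number $\lambda$, our inductive assumption (3.8) applies to $g_\lambda$.
This means that $g_\lambda$ has a root in $\mathbb{C}$. By definition, the roots of $g_\lambda$ are $\alpha_i + \alpha_j - \lambda \alpha_i \alpha_j$.
Thus one of these lies in $\mathbb{C}$. In other words, for each $\lambda \in \mathbb{R}$, we can find a pair $i, j$
with $1 \le i < j \le n$ such that

\[ \alpha_i + \alpha_j - \lambda \alpha_i \alpha_j \in \mathbb{C}. \]

Note that the pair $i, j$ depends on $\lambda$: if we switch to a different value of $\lambda$, we might
get a different pair. But as we vary over the infinitely many possible values of $\lambda$,
there are only finitely many possibilities for the corresponding pair $i, j$. This implies
that there must exist $\lambda \ne \mu$ in $\mathbb{R}$ that use the same pair $i, j$. Thus

\[ \alpha_i + \alpha_j - \lambda \alpha_i \alpha_j \in \mathbb{C} \quad \text{and} \quad \alpha_i + \alpha_j - \mu \alpha_i \alpha_j \in \mathbb{C}. \tag{3.11} \]
Subtracting, we obtain
\[ (\alpha_i + \alpha_j - \lambda \alpha_i \alpha_j) - (\alpha_i + \alpha_j - \mu \alpha_i \alpha_j) = (\mu - \lambda) \alpha_i \alpha_j \in \mathbb{C}, \]

and since $\lambda \ne \mu$ are real, it follows that $\alpha_i \alpha_j \in \mathbb{C}$. Then $\alpha_i + \alpha_j - \lambda \alpha_i \alpha_j \in \mathbb{C}$ implies
that $\alpha_i + \alpha_j \in \mathbb{C}$. Thus the sum and product of $\alpha_i, \alpha_j$ are complex numbers.
Now consider the quadratic polynomial

\[ (x - \alpha_i)(x - \alpha_j) = x^2 - (\alpha_i + \alpha_j) x + \alpha_i \alpha_j. \]

By what we just proved, it has coefficients in $\mathbb{C}$, so that its roots also lie in $\mathbb{C}$ by
Lemma 3.2.3. But the roots are clearly $\alpha_i$ and $\alpha_j$. This proves that $\alpha_i, \alpha_j \in \mathbb{C}$. Hence
$f$ has a complex root, which completes the proof of the theorem. ∎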


Mathematical Notes 
As usual, this section has some interesting ideas to discuss. 


■ Proofs of the Fundamental Theorem. There are many proofs of the Fundamental
Theorem of Algebra. Students often see a proof in a course on complex analysis, 
but there are also some lovely proofs which use topology. The book [4] discusses 
a variety of proofs of the theorem, including a version of the proof given here. See 
also [6] for another proof and references to some of the many other proofs in the 
literature. 

The proof of the Fundamental Theorem of Algebra given above is one of the more 
“algebraic” proofs. However, a closer inspection shows that our proof has three main 
ingredients: 

• Every polynomial of odd degree in $\mathbb{R}[x]$ has a root in $\mathbb{R}$ (Proposition 3.2.2).

• Every complex number has a square root in $\mathbb{C}$ (this gives Lemma 3.2.3).

• Every polynomial splits completely over some extension field.

Of these three, only the last is purely algebraic. The proof of Proposition 3.2.2 uses
the IVT, and as shown in Exercise 2, square roots of complex numbers reduce to
square roots of positive real numbers, which exist by the IVT (if you’re unfamiliar
with this argument, do Exercise 3). Since the IVT depends on the completeness of
$\mathbb{R}$, one could argue that the Fundamental Theorem of Algebra is really a theorem in
analysis. See [1] for a discussion of these issues.

Once we have proved the main theorems of Galois theory, we will give an elegant 
proof of the Fundamental Theorem due to Artin in Theorem 8.5.9. 


■ Algebraically Closed Fields. The Fundamental Theorem of Algebra leads to the
following definition.


Definition 3.2.5 A field $F$ is called algebraically closed if every nonconstant
polynomial in $F[x]$ splits completely over $F$.


Theorem 3.2.4 shows that C is algebraically closed. We will see that there are 
other algebraically closed fields. In general, one can prove that every field has an 
algebraically closed extension. 


■ Real Closed Fields. Another approach to the question of algebraic versus analytic
is given by the theory of real closed fields. The basic idea is to make the above 
proof as algebraic as possible. We know from Exercise 2 that the existence of square 
roots of complex numbers follows directly from the existence of real square roots of 
positive real numbers. Then one defines a real closed field to be a field F that has the 
following properties: 


• $F$ has an order relation $>$ compatible with addition and with multiplication by
positive elements (an element $a \in F$ is positive if it satisfies $a > 0$).

• Every positive element of $F$ has a square root in $F$.

• Every polynomial of odd degree in $F[x]$ has a root in $F$.


The field of real numbers is the prototypical example of a real closed field, but it is 
not the only one. 

Given a real closed field $F$, we can adjoin $i = \sqrt{-1}$ to $F$ using the methods of
Section 3.1 (e.g., Cauchy’s method). This gives a field $F(i)$, and one can easily adapt
the proof of Theorem 3.2.4 to show that $F(i)$ is algebraically closed. Details can be
found in Exercises 4 and 5 (see [Jacobson, Vol. I, Sec. 5.1] for a complete treatment).
There is also the related idea of a formally real field, due to Artin and Schreier. These 
fields have an interesting relation to Hilbert’s Seventeenth Problem and are discussed 
in [Jacobson, Vol. II, Ch. 11]. 


Historical Notes 


In 1749 Euler attempted to prove the Fundamental Theorem of Algebra for $f \in \mathbb{R}[x]$
using induction on the exponent of 2 in $\deg(f)$. To give the flavor of his proof,
consider the case when $\deg(f) = 2^n$. The idea is to write $f$ as a product

\[ f = gh \tag{3.12} \]

where $g, h$ have degree $2^{n-1}$. Euler did this by finding the equations satisfied by the
coefficients of $g, h$ and then showing that they have real solutions. It follows that the
coefficients of $g$ and $h$ can be chosen to be real. Once this is done, our inductive
assumption implies that $g$ and $h$ have roots in $\mathbb{C}$.

Euler’s proof has some major gaps, and in 1772 Lagrange wrote Sur la forme
des racines imaginaires des équations [Lagrange, pp. 479-516] to make Euler’s
argument more rigorous. Lagrange’s proof is almost complete; the difficulty comes
when some of the polynomials in the proof have multiple roots, which might cause
certain denominators to vanish. Lagrange was well aware of this problem and gave 
some very interesting arguments to deal with multiple roots. Many authors (including 
[Tignol] and [2]) accept Lagrange’s argument as complete, though I think that some 
subtle gaps still remain. See [1] and [2] for more on the history of all this. 

We next turn to Gauss’s 1815 proof of the Fundamental Theorem of Algebra,
which appears in [Gauss, Vol. III, pp. 31-56] (see [5, pp. 292-306] for an English
translation). The overall strategy of Gauss’s argument is similar to what we did in
Theorem 3.2.4, with one major exception: he never uses the roots $\alpha_i$ of $f$. He begins
instead with the universal situation and defines

\[ z = \prod_{1 \le i < j \le n} \bigl(x - (x_i + x_j) u + x_i x_j\bigr) \in \mathbb{R}[u, x, x_1, \dots, x_n]. \]

This is similar to the polynomial $G_\lambda$ defined in (3.9), except that $u$ is now a variable.
Gauss observes that $z$ is a polynomial in $x$ and $u$ whose coefficients are symmetric
in the $x_i$. Hence the coefficients are polynomials in the $\sigma_i$. He then replaces each $\sigma_i$
with a new variable $u_i$. This gives a new polynomial

\[ \zeta = \zeta(x, u, u_1, \dots, u_n) \in \mathbb{R}[x, u, u_1, \dots, u_n]. \]

Thus Gauss is using the isomorphism $\mathbb{R}[u_1, \dots, u_n] \simeq \mathbb{R}[\sigma_1, \dots, \sigma_n]$, which we proved
in Section 2.2 using arguments taken from this paper of Gauss. Then, given

\[ f = x^n + a_1 x^{n-1} + \cdots + a_n \in \mathbb{R}[x], \]
he uses the substitution $u_i \mapsto (-1)^i a_i$ to send $\zeta$ to a polynomial
\[ Z = Z(x, u) \in \mathbb{R}[x, u]. \tag{3.13} \]

In this way, Gauss gets an analog of $g_\lambda(x)$ without knowing the roots of $f$.

From here, Gauss’s argument departs from what we did in Theorem 3.2.4. One 
difference is that he considers only monic polynomials with nonvanishing discrim- 
inant (to be called separable in Section 5.3). Other aspects of Gauss’s proof are 
discussed in [1]. 

In his proof, Gauss uses the methods of Lagrange, which apply to the universal 
polynomial f studied in Chapter 2. Although these methods are powerful, they can 
be cumbersome to use in practice. What we really need are methods which apply 
directly to any field. This leads to the language of field extensions, which is the main 
topic of the next chapter. 



Exercises for Section 3.2 


Exercise 1. For $f \in \mathbb{C}[x]$, define $\bar{f}$ as in (3.5).
(a) Show carefully that $\overline{fg} = \bar{f}\,\bar{g}$ for $f, g \in \mathbb{C}[x]$.
(b) Let $\alpha \in \mathbb{C}$. Show that $\bar{f}(\alpha) = 0$ implies that $f(\bar{\alpha}) = 0$.


Exercise 2. In Section A.2, we use polar coordinates to construct square (and higher) roots of
complex numbers. In this exercise, you will give an elementary argument that every complex
number has a square root. The only fact you will use (besides standard algebra) is that every
positive real number has a real square root.
(a) First explain why every real number has a square root in $\mathbb{C}$.
(b) Now fix $a + bi \in \mathbb{C}$ with $b \ne 0$. For $x, y \in \mathbb{R}$, show that the equation $(x + iy)^2 = a + bi$ is
equivalent to the equations
\[ x^2 - y^2 = a, \quad 2xy = b. \]
(c) Show that the equations of part (b) are equivalent to
\[ x^2 = \frac{a \pm \sqrt{a^2 + b^2}}{2}, \quad y = \frac{b}{2x}. \]
Also show that $x \ne 0$ and that $a + \sqrt{a^2 + b^2}$ is positive when we choose the $+$ sign in the
formula for $x^2$.
(d) Conclude that $a + bi$ has a square root in $\mathbb{C}$.


Exercise 3. Use the IVT to prove that every positive real number $a$ has a real square root.


Exercise 4. A field $F$ is an ordered field if there is a subset $P \subset F$ such that:

(a) $P$ is closed under addition and multiplication.

(b) For any $a \in F$, exactly one of the following is true: $a \in P$, $a = 0$, or $-a \in P$.
One then defines $a > b$ to mean $a - b \in P$ (so that $P$ becomes the set of positive elements).
From this, one can prove all of the typical properties of $>$. Now let $F$ be an ordered field.
Prove that $-1$ is not a square in $F$.


Exercise 5. Let $F$ be a real closed field. As in the text, this means that $F$ is an ordered field
(see Exercise 4) such that every positive element of $F$ has a square root in $F$ and every $f \in F[x]$
of odd degree has a root in $F$.
(a) Use Exercise 4 to show that $x^2 + 1$ is irreducible over $F$. Then define $F(i)$ to be the field
$F[x]/(x^2 + 1)$. By the Cauchy construction described in Section 3.1, elements of $F(i)$
can be written $a + bi$ for $a, b \in F$.
(b) Show that every quadratic polynomial in $F(i)[x]$ splits completely over $F(i)$.
(c) Prove that $F(i)$ is algebraically closed.


Exercise 6. Here is yet another way to state the Fundamental Theorem of Algebra.
(a) Suppose that $f(\alpha) = 0$, where $f \in \mathbb{R}[x]$ and $\alpha \in \mathbb{C}$. Prove that $f(\bar{\alpha}) = 0$.
(b) Prove that the Fundamental Theorem of Algebra is equivalent to the assertion that every
nonconstant polynomial in $\mathbb{R}[x]$ is a product of linear and quadratic factors with real
coefficients.


Exercise 7. Prove that a field $F$ is algebraically closed if and only if every nonconstant
polynomial in $F[x]$ has a root in $F$.
PART II


FIELDS 


In the next four chapters, we shift our attention from polynomials to fields. 

We begin by developing the basic language of field extensions in Chapter 4. One 
of the key concepts is the degree of an extension. We also consider the special role 
played by irreducible polynomials. 

Chapter 5 continues our study of fields by considering splitting fields, which are 
fields obtained by adjoining the roots of a given polynomial. This leads naturally 
to the notion of a normal extension. Finally, we introduce the idea of separability, 
which for a polynomial means not having multiple roots. 

We introduce the Galois group in Chapter 6, and we explain how it relates to 
permutations of roots in the case of a splitting field. We also give some nontrivial 
examples and, in an optional section, discuss Abel’s notion of an Abelian equation. 

Finally, Chapter 7 defines the key ideas of a Galois extension and the Galois 
correspondence. After stating and proving the Fundamental Theorem of Galois 
Theory, we give some simple applications. 


CHAPTER 4 


EXTENSION FIELDS 


This chapter will develop the language of algebraic extensions, which is needed to 
prove the main theorems of Galois theory. Recall from Chapter 3 that an extension
of a field $F$ consists of a field $L$ and a ring homomorphism

\[ \varphi : F \to L. \]

As before, we will identify $F$ with its image $\varphi(F)$ in $L$. In this way, we will write a
field extension as $F \subset L$.


4.1 ELEMENTS OF EXTENSION FIELDS 


Given a field extension $F \subset L$, elements of the larger field can relate to the smaller
field in two different ways.


Definition 4.1.1 Let $L$ be an extension of $F$, and let $\alpha \in L$. Then $\alpha$ is algebraic
over $F$ if there is a nonconstant polynomial $f \in F[x]$ such that $f(\alpha) = 0$. If $\alpha$ is not
algebraic over $F$, then $\alpha$ is transcendental over $F$.

For example, $\sqrt{2} \in \mathbb{R}$ is algebraic over $\mathbb{Q}$, since $\sqrt{2}$ is a root of $x^2 - 2 \in \mathbb{Q}[x]$, and
$\zeta_n = e^{2\pi i/n} \in \mathbb{C}$ is algebraic over $\mathbb{Q}$, since it is a root of $x^n - 1 \in \mathbb{Q}[x]$. The numbers
$\pi$ and $e$ are transcendental over $\mathbb{Q}$, though this is not easy to prove.


Example 4.1.2 To show that $\sqrt{2} + \sqrt{3}$ is algebraic over $\mathbb{Q}$, consider the polynomial
\[ (x - \sqrt{2} - \sqrt{3})(x - \sqrt{2} + \sqrt{3})(x + \sqrt{2} - \sqrt{3})(x + \sqrt{2} + \sqrt{3}). \]

Multiplying this out gives $x^4 - 10x^2 + 1$. Thus $\sqrt{2} + \sqrt{3}$ is the root of a nonconstant
polynomial in $\mathbb{Q}[x]$. We will return to this example many times. ◁▷
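
For readers who want to see the multiplication in Example 4.1.2 carried out, one convenient way (an illustration, not necessarily the intended route) is to pair the factors so that each pair is a difference of squares:

```latex
\begin{align*}
(x - \sqrt{2} - \sqrt{3})(x - \sqrt{2} + \sqrt{3}) &= (x - \sqrt{2})^2 - 3 = x^2 - 2\sqrt{2}\,x - 1, \\
(x + \sqrt{2} - \sqrt{3})(x + \sqrt{2} + \sqrt{3}) &= (x + \sqrt{2})^2 - 3 = x^2 + 2\sqrt{2}\,x - 1, \\
(x^2 - 1 - 2\sqrt{2}\,x)(x^2 - 1 + 2\sqrt{2}\,x) &= (x^2 - 1)^2 - 8x^2 = x^4 - 10x^2 + 1.
\end{align*}
```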


In Section 4.4 we will generalize Example 4.1.2 by showing that if $\alpha, \beta \in L$ are
algebraic over $F$, then so are $\alpha + \beta$ and $\alpha\beta$. Furthermore, in Exercise 1 you will
show that if $\alpha \ne 0$ is algebraic over $F$, then so is $1/\alpha$. This will imply that the set
$\{\alpha \in L \mid \alpha \text{ is algebraic over } F\}$ is a subfield of $L$.


A. Minimal Polynomials. When a € L is algebraic over F, there may be many 
nonconstant polynomials in F [x] with a as a root. One of these polynomials is 
especially nice. 


Lemma 4.1.3 If $\alpha \in L$ is algebraic over $F$, then there is a unique nonconstant monic
polynomial $p \in F[x]$ with the following two properties:

(a) $\alpha$ is a root of $p$, i.e., $p(\alpha) = 0$.

(b) If $f \in F[x]$ is any polynomial with $\alpha$ as a root, then $f$ is a multiple of $p$.


Proof: Among all nonconstant polynomials in $F[x]$ with $\alpha$ as a root, there must be
one of smallest degree. Pick one such polynomial and call it $p$. Multiplying by a
constant if necessary, we may assume that $p$ is monic.
This polynomial certainly satisfies (a). As for (b), suppose that $f(\alpha) = 0$ for some
$f \in F[x]$. The division algorithm from Section A.1 gives us polynomials $q, r \in F[x]$
such that
\[ f = qp + r, \quad r = 0 \ \text{or}\ \deg(r) < \deg(p). \]

Evaluating this equation at $\alpha$ gives

\[ 0 = f(\alpha) = q(\alpha) p(\alpha) + r(\alpha) = r(\alpha), \]

where the last equality uses $p(\alpha) = 0$. If $r$ were nonzero, then $r$ would be nonconstant
(since $r(\alpha) = 0$) and would have strictly smaller degree than $p$, which
would contradict the definition of $p$. Hence $r = 0$, and $p$ satisfies (b).

Finally, to prove uniqueness, suppose that another monic polynomial $\tilde{p}$ satisfied
properties (a) and (b). Then applying (b) for $p$ to $f = \tilde{p}$ implies that $p$ divides $\tilde{p}$,
and reversing the roles of $p$ and $\tilde{p}$, we see that $\tilde{p}$ divides $p$. Since these are monic
polynomials, it follows easily that $p = \tilde{p}$ (see Exercise 2 for details). ∎


It is customary to name the polynomial of Lemma 4.1.3 as follows. 


Definition 4.1.4 Let $\alpha \in L$. If $\alpha$ is algebraic over $F$, then the polynomial $p$ of
Lemma 4.1.3 is called the minimal polynomial of $\alpha$ over $F$.

Besides the characterization given in Lemma 4.1.3, there are other ways to think
about the minimal polynomial.


Proposition 4.1.5 Let $\alpha \in L$ be algebraic over $F$, and let $p \in F[x]$ be its minimal
polynomial. If $f \in F[x]$ is a nonconstant monic polynomial, then

\[
\begin{aligned}
f = p &\iff f \text{ is a polynomial of minimal degree satisfying } f(\alpha) = 0 \\
&\iff f \text{ is irreducible over } F \text{ and } f(\alpha) = 0.
\end{aligned}
\]

Proof: The first equivalence follows from the proof of Lemma 4.1.3. For the second,
we prove that the minimal polynomial $p$ is irreducible over $F$ as follows. If $p = gh$,
where $g, h \in F[x]$ have strictly smaller degree than $p$, then $0 = p(\alpha) = g(\alpha) h(\alpha)$
would imply $g(\alpha) = 0$ or $h(\alpha) = 0$, which would contradict the first equivalence.
Conversely, suppose that $f(\alpha) = 0$ and $f$ is irreducible. Hence $p$ divides $f$
by Lemma 4.1.3, so that $f = ph$ with $h \in F[x]$. Since $f$ is irreducible and $p$ is
nonconstant, $h$ must be constant. Then $f = p$ follows, since $f$ and $p$ are monic. ∎


Here are some examples of minimal polynomials. 


Example 4.1.6 The minimal polynomial of $\sqrt{2}$ over $\mathbb{Q}$ is $x^2 - 2$. This follows from
the irrationality of $\sqrt{2}$, which implies that $\sqrt{2}$ cannot be the root of a polynomial of
degree 1 in $\mathbb{Q}[x]$. ◁▷


Example 4.1.7 For $\sqrt{2} + \sqrt{3}$, we showed in Example 4.1.2 that $\sqrt{2} + \sqrt{3}$ is a root
of $x^4 - 10x^2 + 1$. But is this the minimal polynomial? By Proposition 4.1.5, this is
equivalent to $x^4 - 10x^2 + 1$ being irreducible over $\mathbb{Q}$. For an explicit polynomial, the
easiest way to check for irreducibility is by computer. For example, the Mathematica
command

    Factor[x^4 - 10x^2 + 1]

will produce the output $x^4 - 10x^2 + 1$, which means that the polynomial is irreducible
over $\mathbb{Q}$. In Maple, the command would be

    factor(x^4 - 10*x^2 + 1);

and again the output $x^4 - 10x^2 + 1$ proves irreducibility over $\mathbb{Q}$. Thus $x^4 - 10x^2 + 1$
is the minimal polynomial of $\sqrt{2} + \sqrt{3}$. In Section 4.2 we will say more about using
Mathematica and Maple to check irreducibility. ◁▷
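
If neither system is at hand, the same conclusion can be checked with a short script. The sketch below is not what Mathematica or Maple do internally. It handles only monic quartics with integer coefficients and relies on two standard facts that are assumed here rather than proved: a rational root of a monic integer polynomial is an integer dividing the constant term, and (by Gauss's lemma) a monic integer polynomial that factors over $\mathbb{Q}$ also factors into monic integer polynomials. The function names are hypothetical.

```python
from math import isqrt

def divisors(n):
    """All positive and negative integer divisors of the nonzero integer n."""
    n = abs(n)
    ds = [d for d in range(1, n + 1) if n % d == 0]
    return ds + [-d for d in ds]

def is_irreducible_monic_quartic(c3, c2, c1, c0):
    """Test x^4 + c3 x^3 + c2 x^2 + c1 x + c0 (integer coefficients) for irreducibility over Q."""
    if c0 == 0:
        return False                                  # x is a factor
    # Linear factors: an integer root must divide the constant term.
    for r in divisors(c0):
        if r**4 + c3 * r**3 + c2 * r**2 + c1 * r + c0 == 0:
            return False
    # Quadratic factors (x^2 + a x + b)(x^2 + c x + d) with integers a, b, c, d:
    #   b*d = c0,  a + c = c3,  b + d + a*c = c2,  a*d + b*c = c1.
    for b in divisors(c0):
        d = c0 // b
        # a + c = c3 and a*c = c2 - b - d, so a solves t^2 - c3*t + (c2 - b - d) = 0.
        disc = c3 * c3 - 4 * (c2 - b - d)
        if disc < 0:
            continue
        s = isqrt(disc)
        if s * s != disc:
            continue
        for numerator in (c3 + s, c3 - s):
            if numerator % 2 == 0:
                a = numerator // 2
                c = c3 - a
                if a * d + b * c == c1:
                    return False
    return True

print(is_irreducible_monic_quartic(0, -10, 0, 1))     # x^4 - 10x^2 + 1
print(is_irreducible_monic_quartic(0, -5, 0, 6))      # x^4 - 5x^2 + 6 = (x^2 - 2)(x^2 - 3)
```

For $x^4 - 10x^2 + 1$ the search over quadratic factors reduces to asking whether $a^2 = 12$ or $a^2 = 8$ has an integer solution, which it does not.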


Example 4.1.8 The minimal polynomial of $\zeta_n = e^{2\pi i/n}$ over $\mathbb{Q}$ is called the $n$th
cyclotomic polynomial and is denoted $\Phi_n(x)$. In Chapter 9 we will show that $\Phi_n(x)$
has degree $\phi(n)$, where $\phi$ is the Euler $\phi$-function from number theory. ◁▷


B. Adjoining Elements. We next show how to describe some interesting subrings
and subfields of a given extension $F \subset L$. Given $\alpha_1, \dots, \alpha_n \in L$, we define

\[ F[\alpha_1, \dots, \alpha_n] = \{ h(\alpha_1, \dots, \alpha_n) \mid h \in F[x_1, \dots, x_n] \}. \]

Hence $F[\alpha_1, \dots, \alpha_n]$ consists of all polynomial expressions in $L$ that can be formed
using $\alpha_1, \dots, \alpha_n$ with coefficients in $F$. Then let

\[ F(\alpha_1, \dots, \alpha_n) = \{ \alpha/\beta \mid \alpha, \beta \in F[\alpha_1, \dots, \alpha_n],\ \beta \ne 0 \}. \]

Thus $F(\alpha_1, \dots, \alpha_n)$ is the set of all rational expressions in the $\alpha_i$ with coefficients in
$F$. We can characterize $F(\alpha_1, \dots, \alpha_n)$ as follows.


Lemma 4.1.9 $F(\alpha_1, \dots, \alpha_n)$ is the smallest subfield of the field $L$ containing $F$ and
$\alpha_1, \dots, \alpha_n$.


Proof: We leave it as Exercise 3 to show that $F(\alpha_1, \dots, \alpha_n)$ is a subfield of $L$. Thus,
to prove the lemma, we must show that if $K$ is a subfield of $L$ containing $F$ and
$\alpha_1, \dots, \alpha_n$, then $F(\alpha_1, \dots, \alpha_n) \subset K$. This is what “smallest” means in the statement
of the lemma.

Suppose that $K \subset L$ contains $F$ and $\alpha_1, \dots, \alpha_n$. Since $K$ is closed under
multiplication and addition, it follows that $p(\alpha_1, \dots, \alpha_n) \in K$ for any polynomial
$p \in F[x_1, \dots, x_n]$. This shows that $F[\alpha_1, \dots, \alpha_n] \subset K$. Then $F(\alpha_1, \dots, \alpha_n) \subset K$ follows
immediately, since $K$ is a field. ∎


Since $F(\alpha_1, \dots, \alpha_n)$ is a subfield of $L$ containing $F$, we get extensions
\[ F \subset F(\alpha_1, \dots, \alpha_n) \subset L. \]

We say that $F(\alpha_1, \dots, \alpha_n)$ is obtained from $F$ by adjoining $\alpha_1, \dots, \alpha_n \in L$. We can
use this to construct fields as follows.


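Example 4.1.10 Consider the polynomial $x^4 - 2 \in \mathbb{Q}[x]$. Over the complex numbers,
this factors as

\[ x^4 - 2 = (x - \sqrt[4]{2})(x + \sqrt[4]{2})(x - i\sqrt[4]{2})(x + i\sqrt[4]{2}), \]
since the roots of $x^4 - 2$ are $\pm\sqrt[4]{2}, \pm i\sqrt[4]{2}$. It follows that
\[ \mathbb{Q}(\sqrt[4]{2}, -\sqrt[4]{2}, i\sqrt[4]{2}, -i\sqrt[4]{2}) \]

is the smallest field over which $x^4 - 2$ splits completely. We will see in Section 5.1
that this is an example of a splitting field.
This field can be described more compactly as

\[ \mathbb{Q}(\sqrt[4]{2}, -\sqrt[4]{2}, i\sqrt[4]{2}, -i\sqrt[4]{2}) = \mathbb{Q}(i, \sqrt[4]{2}). \tag{4.1} \]
To see why, let $K = \mathbb{Q}(\sqrt[4]{2}, -\sqrt[4]{2}, i\sqrt[4]{2}, -i\sqrt[4]{2})$ and $L = \mathbb{Q}(i, \sqrt[4]{2})$. Then $K \subset L$ follows
from Lemma 4.1.9, since $L$ contains $\mathbb{Q}$ and $\pm\sqrt[4]{2}, \pm i\sqrt[4]{2}$. For the opposite inclusion,
note that

\[ i = \frac{i\sqrt[4]{2}}{\sqrt[4]{2}} \in K. \]

Since $K$ obviously contains $\mathbb{Q}$ and $\sqrt[4]{2}$, we have $L \subset K$, and (4.1) follows. ◁▷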
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Lemma 4.1.9 also implies that we can adjoin elements to a field in stages. More 
precisely, we have the following corollary. 


Corollary 4.1.11 If $F \subset L$ and $\alpha_1,\dots,\alpha_n \in L$, then
$$F(\alpha_1,\dots,\alpha_n) = F(\alpha_1,\dots,\alpha_r)(\alpha_{r+1},\dots,\alpha_n)$$
for any $1 \le r \le n-1$.


Proof: The field on the right is obtained by first adjoining $\alpha_1,\dots,\alpha_r$ to $F$ to get
the field $F(\alpha_1,\dots,\alpha_r)$ and then adjoining $\alpha_{r+1},\dots,\alpha_n$ to $F(\alpha_1,\dots,\alpha_r)$ to get the
field $F(\alpha_1,\dots,\alpha_r)(\alpha_{r+1},\dots,\alpha_n)$. This field obviously contains $F$ and the elements
$\alpha_1,\dots,\alpha_r,\alpha_{r+1},\dots,\alpha_n$. Then Lemma 4.1.9 implies that

$$F(\alpha_1,\dots,\alpha_n) \subset F(\alpha_1,\dots,\alpha_r)(\alpha_{r+1},\dots,\alpha_n).$$
The opposite inclusion is similar and is left as Exercise 4. ∎


Here is a simple example of why this corollary is useful. 
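Example 4.1.12 Corollary 4.1.11 implies that $\mathbb{Q}(\sqrt{2}, \sqrt{3}) = \mathbb{Q}(\sqrt{2})(\sqrt{3})$. Then
$$\mathbb{Q} \subset \mathbb{Q}(\sqrt{2}) \subset \mathbb{Q}(\sqrt{2})(\sqrt{3}) = \mathbb{Q}(\sqrt{2}, \sqrt{3})$$


shows that we get $\mathbb{Q}(\sqrt{2}, \sqrt{3})$ by first adjoining $\sqrt{2}$ to $\mathbb{Q}$ and then adjoining $\sqrt{3}$ to
$\mathbb{Q}(\sqrt{2})$. Representing an extension this way will be very useful. <>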


We next consider $F(\alpha_1,\dots,\alpha_n)$ and $F[\alpha_1,\dots,\alpha_n]$ when $\alpha_1,\dots,\alpha_n$ are algebraic
over $F$. We begin with the case of adjoining a single element.


Lemma 4.1.13 Assume that $F \subset L$ is a field extension, and let $\alpha \in L$ be algebraic
over $F$ with minimal polynomial $p \in F[x]$. Then there is a unique ring isomorphism


$$F[\alpha] \simeq F[x]/(p)$$
that is the identity on $F$ and maps $\alpha$ to the coset $x + (p)$.


Proof: Consider the ring homomorphism $\varphi : F[x] \to L$ that sends $h(x) \in F[x]$ to
$h(\alpha) \in L$. By definition, the image of $\varphi$ is $F[\alpha]$. As for the kernel, we claim that
$\mathrm{Ker}(\varphi) = (p)$. To prove this, first note that $g \in F[x]$ implies that


$$\varphi(gp) = \varphi(g)\varphi(p) = g(\alpha)p(\alpha) = g(\alpha) \cdot 0 = 0.$$


This shows that $(p) \subset \mathrm{Ker}(\varphi)$. For the other inclusion, suppose that $f \in \mathrm{Ker}(\varphi)$.
Then $f(\alpha) = 0$, which by part (b) of Lemma 4.1.3 implies that $f$ is a multiple of $p$.
Thus $\mathrm{Ker}(\varphi) \subset (p)$, and $\mathrm{Ker}(\varphi) = (p)$ follows.

Since we know the image and kernel of $\varphi$, the Fundamental Theorem of Ring
Homomorphisms (Theorem A.1.9) gives a ring isomorphism


$$F[x]/(p) \simeq F[\alpha].$$
This isomorphism is the identity on $F$ and maps the coset $x + (p)$ to $\alpha$. Its inverse is
the isomorphism described in the statement of the lemma.

Finally, uniqueness follows since a ring homomorphism defined on $F[\alpha]$ is
uniquely determined by its values on $F$ and $\alpha$. ∎


This lemma shows how to represent $F[\alpha]$ as a quotient ring. However, we also
know that the minimal polynomial $p$ of $\alpha$ is irreducible (Proposition 4.1.5). As we
saw in Proposition 3.1.1, this implies that $F[x]/(p)$ is a field. By Lemma 4.1.13, it
follows that $F[\alpha]$ is a field when $\alpha$ is algebraic over $F$. Hence we have proved part
of the following proposition.


Proposition 4.1.14 Assume that $F \subset L$ is a field extension, and let $\alpha \in L$. Then $\alpha$ is
algebraic over $F$ if and only if $F[\alpha] = F(\alpha)$.


Proof: When $\alpha$ is algebraic over $F$, the above paragraph shows that $F[\alpha]$ is a field
containing $F$ and $\alpha$. Since $F(\alpha)$ is the smallest subfield of $L$ containing $F$ and $\alpha$
(Lemma 4.1.9), it follows that $F(\alpha) \subset F[\alpha]$. The opposite inclusion always holds,
so that $F(\alpha) = F[\alpha]$ when $\alpha$ is algebraic over $F$.

For the other implication, suppose that $F[\alpha] = F(\alpha)$. We may assume that $\alpha \neq 0$
since 0 is obviously algebraic over $F$. Then $1/\alpha \in F(\alpha) = F[\alpha]$ implies that


$$1/\alpha = a_0 + a_1\alpha + \cdots + a_m\alpha^m$$
for some $a_0,\dots,a_m \in F$. Thus
$$0 = -1 + a_0\alpha + a_1\alpha^2 + \cdots + a_m\alpha^{m+1},$$
proving that $\alpha$ is algebraic over $F$. ∎
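
The argument above shows that inverses exist in $F[\alpha]$ when $\alpha$ is algebraic, but it does not compute them. The following Python sketch is not from the text: it finds $1/g(\alpha)$ with the extended Euclidean algorithm, the route suggested in Exercise 7, for $F = \mathbb{Q}$. Polynomials are stored as lists of coefficients, lowest degree first, and the helper names (`trim`, `divmod_poly`, `inverse_mod`) are hypothetical.

```python
from fractions import Fraction

def trim(f):
    """Convert to Fractions and drop trailing zeros (lowest degree first)."""
    f = [Fraction(c) for c in f]
    while f and f[-1] == 0:
        f.pop()
    return f

def add(f, g):
    n = max(len(f), len(g))
    f = list(f) + [Fraction(0)] * (n - len(f))
    g = list(g) + [Fraction(0)] * (n - len(g))
    return trim([a + b for a, b in zip(f, g)])

def mul(f, g):
    if not f or not g:
        return []
    out = [Fraction(0)] * (len(f) + len(g) - 1)
    for i, a in enumerate(f):
        for j, b in enumerate(g):
            out[i + j] += a * b
    return trim(out)

def divmod_poly(f, g):
    """Return (q, r) with f = q*g + r and deg r < deg g; g must be nonzero."""
    f, g = trim(f), trim(g)
    q = [Fraction(0)] * max(len(f) - len(g) + 1, 1)
    r = f
    while len(r) >= len(g):
        shift = len(r) - len(g)
        c = r[-1] / g[-1]
        q[shift] = c
        # subtract c * x^shift * g, which cancels the leading term of r
        r = add(r, mul([Fraction(0)] * shift + [-c], g))
    return trim(q), r

def inverse_mod(g, p):
    """Return B with B*g = 1 modulo p, assuming g and p are relatively prime."""
    # Invariant: r0 = s0*g and r1 = s1*g modulo p.
    r0, s0 = trim(p), []
    r1, s1 = divmod_poly(g, p)[1], [Fraction(1)]
    while r1:
        q, r2 = divmod_poly(r0, r1)
        r0, s0, r1, s1 = r1, s1, r2, add(s0, mul([-c for c in q], s1))
    if len(r0) != 1:
        raise ValueError("g and p are not relatively prime")
    # r0 is a nonzero constant; scale so that the gcd becomes 1.
    return divmod_poly([c / r0[0] for c in s0], p)[1]

# 1/(1 + sqrt(2)) in Q[sqrt(2)], using p = x^2 - 2
print(inverse_mod([1, 1], [-2, 0, 1]))
```

With $p = x^2 - 2$ and $g = 1 + x$ the result represents $-1 + \sqrt{2}$, which agrees with $(1 + \sqrt{2})(\sqrt{2} - 1) = 1$.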


We next study what happens when we adjoin several algebraic elements to a field. 


Proposition 4.1.15 Let $F \subset L$ be a field extension, and let $\alpha_1,\dots,\alpha_n \in L$ be algebraic
over $F$. Then


$$F[\alpha_1,\dots,\alpha_n] = F(\alpha_1,\dots,\alpha_n).$$
Proof: By the argument used in the proof of Proposition 4.1.14, it suffices to prove
that $F[\alpha_1,\dots,\alpha_n]$ is a field. We will do this by induction on $n$. The case $n = 1$ is
covered by Proposition 4.1.14. Now suppose that $n > 1$ and that $F[\alpha_1,\dots,\alpha_{n-1}]$ is
a field. We know that $f(\alpha_n) = 0$ for some nonconstant $f \in F[x]$. We can regard $f$
as having coefficients in the larger field $F[\alpha_1,\dots,\alpha_{n-1}]$, so that $\alpha_n$ is algebraic over
$F[\alpha_1,\dots,\alpha_{n-1}]$. Then Proposition 4.1.14 implies that


$$F[\alpha_1,\dots,\alpha_{n-1}][\alpha_n]$$
is a field. We leave it as Exercise 5 to show that this equals $F[\alpha_1,\dots,\alpha_n]$. ∎


Here is an example of Proposition 4.1.15.

Example 4.1.16 Consider $\mathbb{Q}(\sqrt{2}, \sqrt{3})$. The above proposition shows that this equals
$\mathbb{Q}[\sqrt{2}, \sqrt{3}]$, so that every element of $\mathbb{Q}(\sqrt{2}, \sqrt{3})$ is a polynomial in $\sqrt{2}, \sqrt{3}$ with
rational coefficients. Furthermore, since $(\sqrt{2})^{2n} = 2^n$ and $(\sqrt{2})^{2n+1} = 2^n\sqrt{2}$, and similarly
for powers of $\sqrt{3}$, it follows easily that


(4.2) $$\mathbb{Q}(\sqrt{2}, \sqrt{3}) = \{a + b\sqrt{2} + c\sqrt{3} + d\sqrt{6} \mid a,b,c,d \in \mathbb{Q}\}.$$
In Section 4.3, we will show that the representation of elements of $\mathbb{Q}(\sqrt{2}, \sqrt{3})$ given
by (4.2) is unique. <>
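
To make (4.2) concrete, here is a small Python sketch (an illustration, not part of the text) of exact arithmetic in $\mathbb{Q}(\sqrt{2}, \sqrt{3})$. An element $a + b\sqrt{2} + c\sqrt{3} + d\sqrt{6}$ is stored as the tuple `(a, b, c, d)`. The names `PRODUCT`, `mul`, and `inverse` are hypothetical, and the inverse routine relies on the uniqueness of the representation, which is only proved in Section 4.3.

```python
from fractions import Fraction

# PRODUCT[i][j] = (k, m) means basis[i] * basis[j] = m * basis[k],
# where basis = (1, sqrt2, sqrt3, sqrt6).
PRODUCT = [
    [(0, 1), (1, 1), (2, 1), (3, 1)],
    [(1, 1), (0, 2), (3, 1), (2, 2)],
    [(2, 1), (3, 1), (0, 3), (1, 3)],
    [(3, 1), (2, 2), (1, 3), (0, 6)],
]

def mul(u, v):
    """Multiply two elements given as 4-tuples of rationals."""
    out = [Fraction(0)] * 4
    for i in range(4):
        for j in range(4):
            k, m = PRODUCT[i][j]
            out[k] += m * u[i] * v[j]
    return tuple(out)

def inverse(u):
    """Invert a nonzero element by multiplying with conjugates."""
    a, b, c, d = u
    conj3 = (a, b, -c, -d)            # replace sqrt3 by -sqrt3
    n1 = mul(u, conj3)                # lies in Q(sqrt2): (p, q, 0, 0)
    p, q = n1[0], n1[1]
    norm = p * p - 2 * q * q          # (p + q*sqrt2)(p - q*sqrt2), rational
    num = mul(conj3, (p, -q, 0, 0))
    return tuple(Fraction(x) / norm for x in num)

alpha = (0, 1, 1, 0)                  # sqrt2 + sqrt3
print(mul(alpha, alpha))              # represents 5 + 2*sqrt6
print(inverse(alpha))                 # represents sqrt3 - sqrt2
```

The inverse is again of the form (4.2), which is one way to see directly that the set on the right-hand side of (4.2) is closed under division.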


Mathematical Notes 
Let us discuss two of the ideas that have appeared in this section. 


■ The Structure of Fields. Consider a field of the form $F(\alpha_1,\dots,\alpha_n)$. In Proposition 4.1.15,
we studied the case when the $\alpha_i$ are all algebraic over $F$. The other
extreme is when the $\alpha_i$ are not only transcendental over $F$ but also algebraically
independent, which (as defined in Mathematical Notes to Section 2.2) means that the
$\alpha_i$ satisfy no nontrivial polynomial relation with coefficients in $F$. In Exercise 6 you
will show that this implies that $F(\alpha_1,\dots,\alpha_n)$ is isomorphic to the field of rational
functions $F(x_1,\dots,x_n)$. We call $F(\alpha_1,\dots,\alpha_n)$ a purely transcendental extension of
$F$ in this case.

For the general case, a result of Steinitz says that a field $L = F(\alpha_1,\dots,\alpha_n)$ can
always be written in the form


$$F \subset K = F(\beta_1,\dots,\beta_m) \subset K(\gamma_1,\dots,\gamma_r) = L$$


where $m \le n$, $\beta_1,\dots,\beta_m$ are algebraically independent over $F$ (so that $F \subset K$ is purely
transcendental), and $\gamma_1,\dots,\gamma_r$ are algebraic over $K$. A proof of this theorem can be
found in [Jacobson, Vol. II, Sec. 8.12].


■ Number Fields. A field of the form $\mathbb{Q}(\alpha_1,\dots,\alpha_n)$, where $\alpha_1,\dots,\alpha_n$ are algebraic
over $\mathbb{Q}$, is called a number field. The fields appearing in Examples 4.1.10 and 4.1.16
are number fields. These fields and their Galois theory occupy a central role in
algebraic number theory.


Historical Notes 


Fields have been used implicitly ever since the discovery of addition, subtraction, 
multiplication, and division. Cardan’s formulas, dating from the sixteenth century, 
use $\mathbb{Q}$, $\mathbb{R}$, and $\mathbb{C}$. The field of rational functions in $n$ variables arises naturally when
considering symmetric functions, and Lagrange used such fields (implicitly) in his
1770 study of the roots of polynomials. Number fields also appeared around this
time. For example, Euler used the fields $\mathbb{Q}(\sqrt{-2})$ and $\mathbb{Q}(\sqrt{-3})$ to study problems
in number theory raised by Fermat.

The first reasonably general definition of $F(\alpha_1,\dots,\alpha_n)$ was given by Galois in
1831, where he says the following:

    One can agree to regard as rational all rational functions in a certain number of
    determinate quantities, which are supposed to be known a priori. For example,
    one can choose a certain root of an integer, and regard as rational all rational
    functions of the radical.

    When we agree in this way to regard certain quantities as known, we say that we
    ADJOIN them to the equation that we are trying to solve.


(See [Galois, p. 45].) This is why we say that $F(\alpha_1,\dots,\alpha_n)$ is obtained from $F$ by
adjoining $\alpha_1,\dots,\alpha_n$. On the other hand, Abel was the first person to understand that
$F[\alpha] = F(\alpha)$ when $\alpha$ is algebraic over $F$ (see [Abel, Vol. I, pp. 66-72]).

Abel, Galois, and their predecessors tended to work with explicitly constructed 
fields. The first truly “abstract” notion of field is due to Dedekind. In 1871, he gave 
the following definition: 

    I call a system A of numbers $\alpha$ (not all zero) a field when the sum, difference,
    product and quotient of any two of numbers in A also belongs to A.
(See [3, p. 107].) This is not completely general, for the “numbers” in this definition 
are all complex. From our point of view, Dedekind is really defining a subfield of C. 
But his definition is modern in spirit, in that he allows any set (he says “system” 
because set theory was not fully established in 1871) that behaves nicely under 
addition, subtraction, multiplication, and division. This is very different from his 
great contemporary Kronecker, who took a more conservative view and only dealt 
with fields that could be constructed explicitly in finitely many steps. It wasn’t until 
1893 that Weber gave the first fully abstract definition of field. Weber’s definition is 
similar to the one in use today. A discussion of the evolution of the field concept can 
be found in [7]. See also [8] for the evolution of the ring concept. 


Exercises for Section 4.1 
Exercise 1. Let $\alpha \in L \setminus \{0\}$ be algebraic over a subfield $F$. Prove that $1/\alpha$ is also algebraic
over $F$.


Exercise 2. Complete the proof of Lemma 4.1.3 by showing that if $f$ and $g$ are monic
polynomials in $F[x]$ each of which divides the other, then $f = g$.


Exercise 3. Suppose that $F \subset L$ is a field extension and that $\alpha_1,\dots,\alpha_n \in L$. Show that
$F[\alpha_1,\dots,\alpha_n]$ is a subring of $L$ and that $F(\alpha_1,\dots,\alpha_n)$ is a subfield of $L$.


Exercise 4. Complete the proof of Corollary 4.1.11 by showing that
$$F(\alpha_1,\dots,\alpha_r)(\alpha_{r+1},\dots,\alpha_n) \subset F(\alpha_1,\dots,\alpha_n).$$


Exercise 5. Prove carefully that $F[\alpha_1,\dots,\alpha_{n-1}][\alpha_n] = F[\alpha_1,\dots,\alpha_n]$.


Exercise 6. Suppose that $F \subset L$ and that $\alpha_1,\dots,\alpha_n \in L$ are algebraically independent over $F$
(as defined in the Mathematical Notes to Section 2.2). Prove that there is an isomorphism of
fields


$$F(\alpha_1,\dots,\alpha_n) \simeq F(x_1,\dots,x_n),$$
where $F(x_1,\dots,x_n)$ is the field of rational functions in variables $x_1,\dots,x_n$.


Exercise 7. In the proof of Proposition 4.1.14, we used the quotient ring $F[x]/(p)$ to show
that $F[\alpha]$ is a field when $\alpha$ is algebraic over $F$ with minimal polynomial $p \in F[x]$. Here, you
will prove that $F[\alpha]$ is a field without using quotient rings. Since we know that $F[\alpha]$ is a ring,
it suffices to show that every nonzero element $\beta \in F[\alpha]$ has a multiplicative inverse in $F[\alpha]$.
So pick $\beta \neq 0$ in $F[\alpha]$. Then $\beta = g(\alpha)$ for some $g \in F[x]$.
(a) Show that $g$ and $p$ are relatively prime in $F[x]$.
(b) By part (a) and the Euclidean algorithm, we have $Ap + Bg = 1$ for some $A, B \in F[x]$.
Prove that $B(\alpha) \in F[\alpha]$ is the multiplicative inverse of $g(\alpha)$.
Do you see how this exercise relates to Exercise 5 of Section 3.1?


Exercise 8. If a polynomial is irreducible over a field $F$, it may or may not remain irreducible
over a larger field. Here are examples of both types of behavior.
(a) Prove that $x^2 - 3$ is irreducible over $\mathbb{Q}(\sqrt{2})$.
(b) In Example 4.1.7, we showed that $x^4 - 10x^2 + 1$ is irreducible over $\mathbb{Q}$ (it is the minimal
polynomial of $\alpha = \sqrt{2} + \sqrt{3}$). Show that $x^4 - 10x^2 + 1$ is not irreducible over $\mathbb{Q}(\sqrt{3})$.


4.2 IRREDUCIBLE POLYNOMIALS 


Since minimal polynomials are irreducible, it should be clear that the notion of 
irreducibility plays an important role in field theory. However, given an arbitrary 
polynomial $f \in F[x]$, it may not be obvious that $f$ is irreducible. How do we tell? In
this section, we will discuss some ways of answering this question. 


A. Using Maple and Mathematica. In the previous section we saw examples
of how Maple and Mathematica factor polynomials over $\mathbb{Q}$ into irreducibles. These
programs can also factor over number fields, which as in Section 4.1 are fields of the
form $\mathbb{Q}(\alpha_1,\dots,\alpha_n)$ with $\alpha_1,\dots,\alpha_n$ algebraic over $\mathbb{Q}$.

We first describe how Maple factors polynomials over a number field. In Section 4.1
we used the factor command to show that $x^4 - 10x^2 + 1$ is the minimal
polynomial of $\sqrt{2} + \sqrt{3}$ over $\mathbb{Q}$. To study this polynomial over $\mathbb{Q}(\sqrt{2})$, we use


factor(x^4-10*x^2+1, sqrt(2));
which gives the result
$$(x^2 - 2\sqrt{2}\,x - 1)(x^2 + 2\sqrt{2}\,x - 1).$$


This implies in particular that $x^2 - 2\sqrt{2}\,x - 1$ and $x^2 + 2\sqrt{2}\,x - 1$ are irreducible over
$\mathbb{Q}(\sqrt{2})$. Similarly, the command


factor(x^4-10*x^2+1, [sqrt(2), sqrt(3)]);
gives
(4.3) $$(x - \sqrt{2} - \sqrt{3})(x + \sqrt{2} + \sqrt{3})(x - \sqrt{2} + \sqrt{3})(x + \sqrt{2} - \sqrt{3}).$$


This is the factorization of $x^4 - 10x^2 + 1$ over $\mathbb{Q}(\sqrt{2}, \sqrt{3})$.
Not all number fields have such simple descriptions. For example, consider the
field $\mathbb{Q}(\sqrt{2} + \sqrt{3})$. Since the minimal polynomial of $\sqrt{2} + \sqrt{3}$ is $x^4 - 10x^2 + 1$,
Maple would represent this algebraic number using the RootOf command. This is
done most conveniently via


alias(alpha = RootOf(x^4-10*x^2+1)):


which makes $\alpha$ a root of $x^4 - 10x^2 + 1$. Then we can factor a polynomial poly in
$\mathbb{Q}[x]$ or $\mathbb{Q}(\alpha)[x]$ using the command factor(poly, alpha). For example, if we let
$f = x^4 - 10x^2 + 1$, then we know that $x - \alpha$ is a factor of $f$ over $\mathbb{Q}(\alpha)$. But what are
the other factors? Using the command


factor(x^4-10*x^2+1,alpha);
we get the result
(4.4) $$(x - \alpha)(x + \alpha)(x - 10\alpha + \alpha^3)(x + 10\alpha - \alpha^3).$$


The surprise is that the polynomial factors completely. This has an interesting
consequence concerning the fields $\mathbb{Q}(\sqrt{2} + \sqrt{3})$ and $\mathbb{Q}(\sqrt{2}, \sqrt{3})$.

In Maple, the factorization (4.4) takes place in $\mathbb{Q}[x]/(x^4 - 10x^2 + 1)$. To get
something involving numbers, consider the map $x \mapsto \sqrt{2} + \sqrt{3}$. This induces an
isomorphism


$$\mathbb{Q}[x]/(x^4 - 10x^2 + 1) \simeq \mathbb{Q}(\sqrt{2} + \sqrt{3})$$


and allows us to assume that $\alpha = \sqrt{2} + \sqrt{3}$ in (4.4).

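By comparing (4.3) and (4.4), we conclude that $\alpha^3 - 10\alpha = \pm(\sqrt{2} - \sqrt{3})$, and then
an easy numerical calculation shows that $\alpha^3 - 10\alpha = \sqrt{2} - \sqrt{3}$. Since $\alpha = \sqrt{2} + \sqrt{3}$,
adding these two equations gives $\alpha^3 - 9\alpha = 2\sqrt{2}$, and it follows that
$\sqrt{2} \in \mathbb{Q}(\alpha) = \mathbb{Q}(\sqrt{2} + \sqrt{3})$. Then we also have $\sqrt{3} = \alpha - \sqrt{2} \in \mathbb{Q}(\sqrt{2} + \sqrt{3})$. Then Lemma 4.1.9
implies that


$$\mathbb{Q}(\sqrt{2}, \sqrt{3}) \subset \mathbb{Q}(\sqrt{2} + \sqrt{3}).$$
Since the opposite inclusion clearly holds (be sure you can explain why), we get
(4.5) $$\mathbb{Q}(\sqrt{2}, \sqrt{3}) = \mathbb{Q}(\sqrt{2} + \sqrt{3}).$$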
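
The two key facts used above can be checked without any numerical approximation by computing in the quotient ring $\mathbb{Q}[x]/(x^4 - 10x^2 + 1)$. The Python sketch below is an illustration only (the representation and the name `mul` are assumptions, not from the text). An element $c_0 + c_1\alpha + c_2\alpha^2 + c_3\alpha^3$ is stored as `[c0, c1, c2, c3]` and products are reduced with $\alpha^4 = 10\alpha^2 - 1$.

```python
from fractions import Fraction

def mul(u, v):
    """Multiply in Q[x]/(x^4 - 10x^2 + 1); elements are [c0, c1, c2, c3]."""
    prod = [Fraction(0)] * 7
    for i in range(4):
        for j in range(4):
            prod[i + j] += Fraction(u[i]) * Fraction(v[j])
    # Reduce with alpha^4 = 10*alpha^2 - 1, highest degree first.
    for k in range(6, 3, -1):
        c = prod[k]
        prod[k] = 0
        prod[k - 2] += 10 * c
        prod[k - 4] -= c
    return prod[:4]

alpha = [0, 1, 0, 0]
alpha3 = mul(alpha, mul(alpha, alpha))

# beta = alpha^3 - 10*alpha should be another root of f = x^4 - 10x^2 + 1
beta = [x - 10 * y for x, y in zip(alpha3, alpha)]
beta2 = mul(beta, beta)
beta4 = mul(beta2, beta2)
f_beta = [b4 - 10 * b2 + (1 if i == 0 else 0)
          for i, (b4, b2) in enumerate(zip(beta4, beta2))]

# gamma = (alpha^3 - 9*alpha)/2 should square to 2
gamma = [(x - 9 * y) / 2 for x, y in zip(alpha3, alpha)]
gamma_sq = mul(gamma, gamma)

print(f_beta)    # all zero when beta is a root of f
print(gamma_sq)  # represents the rational number 2
```

The second check says that $(\alpha^3 - 9\alpha)/2$ is a square root of 2 inside $\mathbb{Q}(\alpha)$, which is the algebraic content of $\sqrt{2} \in \mathbb{Q}(\sqrt{2} + \sqrt{3})$. Deciding the sign still requires the numerical comparison mentioned in the text.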


We can also do these computations in Mathematica. For example, factoring
$x^4 - 10x^2 + 1$ over $\mathbb{Q}(\sqrt{2})$ is done by the command


Factor[x^4-10x^2+1,Extension -> {Sqrt[2]}]
and factoring over $\mathbb{Q}(\sqrt{2}, \sqrt{3})$ is done via
Factor[x^4-10x^2+1,Extension -> {Sqrt[2],Sqrt[3]}]


Finally, to work over the field generated by a root of an irreducible polynomial such
as $x^4 - 10x^2 + 1$, one sets


a = Root[x^4-10x^2+1, 1]
Then the command
Factor[x^4-10x^2+1,Extension -> {a}]


produces a result similar to (4.4), except that $\alpha$ is replaced with the ungainly expression
it represents. To get a nicer result, one should use the command


Factor[x^4-10x^2+1,Extension -> {a}] /. a->b
which gives the result
$$(x - b)(x + b)(x - 10b + b^3)(x + 10b - b^3).$$


In general, Maple and Mathematica have roughly equivalent capabilities for computing
with algebraic numbers.


B. Algorithms for Factoring. The use of Maple and Mathematica to factor 
polynomials over number fields implies the existence of an algorithm for doing so. 
To give the reader an idea of how factoring is done, we will describe an algorithm 
for deciding whether $f \in \mathbb{Z}[x]$ is irreducible over $\mathbb{Q}$. The key tool is Gauss’s Lemma,
which is Theorem A.3.2. We will use the following corollary of this result. 


Corollary 4.2.1 If $f \in \mathbb{Z}[x]$ has degree $> 0$ and is reducible (i.e., not irreducible)
over $\mathbb{Q}$, then $f = gh$ where $g, h \in \mathbb{Z}[x]$ have degrees strictly smaller than $\deg(f)$.


Proof: If $f$ is reducible in $\mathbb{Q}[x]$, then $f = g_1h_1$, where $g_1, h_1 \in \mathbb{Q}[x]$ have degrees
$< \deg(f)$. By Gauss’s Lemma, there is $\delta \in \mathbb{Q}$ such that $g = \delta g_1$ and $h = \delta^{-1}h_1$ have
integer coefficients. Then $f = gh$ is the desired factorization. ∎


We now describe an algorithm to test the irreducibility of $f \in \mathbb{Z}[x]$. Let
$n = \deg(f) > 0$. First note that if $f(i) = 0$ for some $0 \le i \le n-1$, then $x - i$ is a factor
of $f$ and we can quit. Hence, when performing the algorithm, we may assume that
$f(0),\dots,f(n-1)$ are nonzero. Then create a set of polynomials as follows:

• Fix an integer $0 < d < n$.
• Fix divisors $a_0,\dots,a_d \in \mathbb{Z}$ of $f(0),\dots,f(d) \in \mathbb{Z}$.
• Use the Lagrange interpolation formula from Exercise 1 to construct a polynomial
  $g \in \mathbb{Q}[x]$ of degree $\le d$ such that $g(i) = a_i$ for $i = 0,\dots,d$.
• Accept $g$ if it has degree $d$ and integer coefficients; reject it otherwise.


Doing this for all $0 < d < n$ and all divisors $a_0 \mid f(0),\dots,a_d \mid f(d)$ gives a set of
polynomials $g \in \mathbb{Z}[x]$.


Proposition 4.2.2 This set of polynomials $g \in \mathbb{Z}[x]$ is finite, and $f$ is irreducible over
$\mathbb{Q}$ if and only if it is not divisible by any of the polynomials in this set.


Proof: We are assuming that $f(0),\dots,f(d)$ are nonzero, so that each $f(i)$ has only
finitely many divisors. Hence there are only finitely many choices for $a_0,\dots,a_d \in \mathbb{Z}$.
Since $g$ is uniquely determined by the $a_i$, it follows that there are only finitely many
such $g$’s.

To finish the proof, we will show that $f$ is reducible if and only if it is divisible by
one of these polynomials. One direction is obvious. For the other direction, suppose
that $f$ is reducible. By Corollary 4.2.1, $f = gh$, where $g, h \in \mathbb{Z}[x]$ and $g$ has degree
$d$, $0 < d < n$.

Then, for $0 \le i \le d$, let $a_i = g(i)$, and note that $a_i \mid f(i)$, since $f(i) = g(i)h(i)$.
The Lagrange interpolation formula gives $\tilde{g} \in \mathbb{Q}[x]$ of degree $\le d$ with $\tilde{g}(i) = a_i$ for
$0 \le i \le d$. Since $g - \tilde{g}$ has degree at most $d$ and vanishes at the $d + 1$ numbers $0,\dots,d$,
it must be the zero polynomial. Hence $g = \tilde{g}$ is on our list. ∎


Since there are known algorithms for factoring integers, there is an algorithm for
computing the set of polynomials $g \in \mathbb{Z}[x]$ used in Proposition 4.2.2. Then dividing
these into our given polynomial $f$ via the division algorithm gives an algorithm for
deciding whether $f$ is irreducible over $\mathbb{Q}$.
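
The procedure just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions: polynomials are lists of integer coefficients, lowest degree first, the function names are hypothetical, and degree 1 is treated as irreducible before the root check. It follows the steps of the text literally and makes no attempt at efficiency.

```python
from fractions import Fraction
from itertools import product

def divisors(m):
    """All positive and negative integer divisors of a nonzero integer m."""
    m = abs(m)
    pos = [d for d in range(1, m + 1) if m % d == 0]
    return pos + [-d for d in pos]

def evaluate(f, x):
    """Evaluate a polynomial given as a coefficient list, lowest degree first."""
    return sum(c * x**k for k, c in enumerate(f))

def interpolate(values):
    """Lagrange interpolation through the points (i, values[i]), i = 0..d."""
    d = len(values) - 1
    result = [Fraction(0)] * (d + 1)
    for i, a in enumerate(values):
        term = [Fraction(a)]
        for j in range(d + 1):
            if j != i:
                # multiply term by (x - j)/(i - j)
                new = [Fraction(0)] * (len(term) + 1)
                for k, c in enumerate(term):
                    new[k] += c * Fraction(-j, i - j)
                    new[k + 1] += c * Fraction(1, i - j)
                term = new
        for k, c in enumerate(term):
            result[k] += c
    return result

def divides(g, f):
    """True if g divides f in Q[x]; g must have nonzero leading coefficient."""
    r = [Fraction(c) for c in f]
    while len(r) >= len(g):
        c = r[-1] / g[-1]
        shift = len(r) - len(g)
        for k, b in enumerate(g):
            r[shift + k] -= c * b
        r.pop()  # the leading coefficient is now zero
    return all(c == 0 for c in r)

def is_irreducible(f):
    """Test irreducibility over Q of f in Z[x] with deg f = n > 0."""
    n = len(f) - 1
    if n == 1:
        return True
    values = [evaluate(f, i) for i in range(n)]
    if 0 in values:
        return False  # x - i is a proper factor
    for d in range(1, n):
        for choice in product(*(divisors(values[i]) for i in range(d + 1))):
            g = interpolate(choice)
            has_degree_d = g[-1] != 0
            integral = all(c.denominator == 1 for c in g)
            if has_degree_d and integral and divides(g, f):
                return False
    return True

print(is_irreducible([1, 0, -10, 0, 1]))   # x^4 - 10x^2 + 1
print(is_irreducible([-4, 0, 0, 0, 1]))    # x^4 - 4 = (x^2 - 2)(x^2 + 2)
```

The number of candidate polynomials grows with the number of divisors of each $f(i)$, so the running time becomes impractical quickly.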

From a computational point of view, this algorithm is dreadful. The methods used 
by Maple and Mathematica are much more efficient. The book [1] describes some 
good algorithms for factoring polynomials over a number field. 


C. The Schönemann–Eisenstein Criterion. While algorithms and computers
can be extremely helpful in computing examples of irreducible polynomials, there
are certain classes of polynomials that can be proved to be irreducible by traditional
means. Here, we will prove the Schönemann–Eisenstein irreducibility criterion.


Theorem 4.2.3 Let $f = a_nx^n + \cdots + a_0 \in \mathbb{Z}[x]$ have degree $n > 0$. If there is a prime
$p$ such that $p \nmid a_n$, $p \mid a_{n-1}, \dots, p \mid a_0$, and $p^2 \nmid a_0$, then $f$ is irreducible over $\mathbb{Q}$.


Proof: By Corollary 4.2.1, if $f$ is reducible over $\mathbb{Q}$, then there are $g, h \in \mathbb{Z}[x]$ of
degree $< n$ such that $f = gh$. Now consider the ring homomorphism $\mathbb{Z}[x] \to \mathbb{F}_p[x]$
defined by sending $g = b_mx^m + \cdots + b_0 \in \mathbb{Z}[x]$ to $\bar{g} = [b_m]x^m + \cdots + [b_0] \in \mathbb{F}_p[x]$,
where $[b] \in \mathbb{F}_p = \mathbb{Z}/p\mathbb{Z}$ is the congruence class modulo $p$ of $b \in \mathbb{Z}$.

Then $f = gh$ implies that $[a_n]x^n = \bar{g}\bar{h}$, since $p \mid a_{n-1}, \dots, p \mid a_0$. However, $\mathbb{F}_p$ is a
field, which means that unique factorization holds in $\mathbb{F}_p[x]$. Since $p \nmid a_n$, it follows
that $\bar{g} = [a]x^r$ and $\bar{h} = [b]x^s$, where $[a][b] = [a_n]$ and $r + s = n$.

If $r = 0$, then $\bar{g} = [a]$ and $\deg(g) > 0$ would imply that the leading term of $g$ is
divisible by $p$. Then $f = gh$ would imply that the same is true for the leading term
$a_n$ of $f$. Thus $p \nmid a_n$ implies that $r > 0$, and $s > 0$ follows similarly.

But then $\bar{g} = [a]x^r$ for $r > 0$ implies that $p$ divides the constant term of $g$, and the
same is true for the constant term of $h$, since $s > 0$. Since the constant term $a_0$ of $f$
is the product of the constant terms of $g$ and $h$, it follows that $p^2 \mid a_0$. This contradicts
$p^2 \nmid a_0$ and completes the proof. ∎


Here is a simple example to illustrate the Schönemann–Eisenstein criterion.
Example 4.2.4 Consider the polynomial
$$f = x^n + px + p, \quad n \ge 2,\ p \text{ prime}.$$


The Schönemann–Eisenstein criterion for the prime $p$ implies immediately that $f$ is
irreducible over $\mathbb{Q}$, no matter what $n \ge 2$ we choose. <>

The interesting feature of this example is that it cannot be done by Maple or
Mathematica. For a specific $n$ and $p$, we could check irreducibility by computer
(assuming $n$ and $p$ aren’t too big), but standard computer algebra systems can’t factor
polynomials with symbolic exponents. On the other hand, only very special polynomials
satisfy the Schönemann–Eisenstein criterion. (If $\alpha$ is a root of a polynomial
satisfying this criterion, then from the point of view of algebraic number theory, the
extension $\mathbb{Q} \subset \mathbb{Q}(\alpha)$ is totally ramified at $p$, which is a rather rare phenomenon.)
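
For a specific polynomial, the hypotheses of Theorem 4.2.3 are easy to test mechanically. The following Python sketch (hypothetical function names; the caller is assumed to supply a prime `p`) checks them for a coefficient list `[a_0, ..., a_n]` and applies the check to several instances of Example 4.2.4.

```python
def eisenstein_applies(coeffs, p):
    """Check the hypotheses of Theorem 4.2.3 for the prime p.

    coeffs = [a_0, a_1, ..., a_n] with a_n != 0 and n > 0.
    """
    a0, an = coeffs[0], coeffs[-1]
    return (
        len(coeffs) > 1
        and an % p != 0                              # p does not divide a_n
        and all(a % p == 0 for a in coeffs[:-1])     # p divides a_{n-1}, ..., a_0
        and a0 % (p * p) != 0                        # p^2 does not divide a_0
    )

def example_poly(n, p):
    """Coefficient list of x^n + p*x + p from Example 4.2.4 (n >= 2)."""
    return [p, p] + [0] * (n - 2) + [1]

for n in (2, 3, 10):
    for p in (2, 3, 5):
        print(n, p, eisenstein_applies(example_poly(n, p), p))
```

A result of `False` says only that the criterion does not apply for that prime. For instance, $x^2 + 1$ is irreducible over $\mathbb{Q}$ even though no prime satisfies the hypotheses.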

We can use the Schönemann–Eisenstein criterion to determine the minimal polynomial
of the $p$th root of unity $\zeta_p = e^{2\pi i/p}$, where $p$ is prime. Using


$$x^p - 1 = (x - 1)(x^{p-1} + \cdots + x + 1),$$
we see that $\zeta_p$ is a root of $\Phi_p = x^{p-1} + \cdots + x + 1$. This is called the $p$th cyclotomic
polynomial.
Proposition 4.2.5 $\Phi_p = x^{p-1} + \cdots + x + 1$ is irreducible over $\mathbb{Q}$ when $p$ is prime.
Proof: First observe that $\Phi_p(x) = (x^p - 1)/(x - 1)$, so that


$$\Phi_p(x + 1) = \frac{(x+1)^p - 1}{(x+1) - 1} = \frac{(x+1)^p - 1}{x}.$$


The binomial theorem tells us that


$$(x+1)^p = x^p + \binom{p}{1}x^{p-1} + \cdots + \binom{p}{r}x^{p-r} + \cdots + \binom{p}{p-1}x + 1,$$


and then substituting this into the above formula for $\Phi_p(x + 1)$ gives


(4.6) $$\Phi_p(x + 1) = x^{p-1} + \binom{p}{1}x^{p-2} + \cdots + \binom{p}{r}x^{p-r-1} + \cdots + \binom{p}{p-1}.$$


However, for $1 \le r \le p - 1$, the integer


$$\binom{p}{r} = \frac{p!}{r!(p-r)!} = \frac{p(p-1)\cdots(p-r+1)}{r!}$$

is divisible by $p$, since $p$ divides the numerator but not the denominator (remember
that $p$ is prime). Furthermore, note that $p^2$ does not divide $\binom{p}{p-1} = p$. Then $\Phi_p(x + 1)$
is irreducible, since (4.6) satisfies the Schönemann–Eisenstein criterion.

From here it is easy to see that $\Phi_p$ is irreducible, for a factorization
$\Phi_p(x) = g(x)h(x)$ in $\mathbb{Q}[x]$ would imply $\Phi_p(x + 1) = g(x + 1)h(x + 1)$. If $g$ and $h$ have degree
$< p - 1$, then the same would be true for $g(x + 1)$ and $h(x + 1)$, which would contradict
the irreducibility of $\Phi_p(x + 1)$. This completes the proof. ∎


It follows that the minimal polynomial of $\zeta_p$ over $\mathbb{Q}$ is $x^{p-1} + \cdots + x + 1$. In
Chapter 9 we will describe the minimal polynomial of $\zeta_n$ for arbitrary $n$.
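
The shifted polynomial in (4.6) is easy to tabulate. As an illustration (hypothetical names, with `math.comb` from the Python standard library), the sketch below lists the coefficients of $\Phi_p(x + 1)$, lowest degree first, and confirms for a few small primes that they satisfy the hypotheses of Theorem 4.2.3.

```python
from math import comb

def shifted_cyclotomic(p):
    """Coefficients of Phi_p(x + 1), lowest degree first, as in (4.6).

    Phi_p(x + 1) = ((x + 1)^p - 1)/x = sum over k = 0..p-1 of C(p, k+1) x^k.
    """
    return [comb(p, k + 1) for k in range(p)]

def eisenstein_applies(coeffs, p):
    """Hypotheses of Theorem 4.2.3 for coeffs = [a_0, ..., a_n] and prime p."""
    return (
        coeffs[-1] % p != 0
        and all(c % p == 0 for c in coeffs[:-1])
        and coeffs[0] % (p * p) != 0
    )

for p in (2, 3, 5, 7, 11, 13):
    coeffs = shifted_cyclotomic(p)
    print(p, coeffs, eisenstein_applies(coeffs, p))
```

The constant term is $\binom{p}{1} = p$ and the leading coefficient is $\binom{p}{p} = 1$, exactly the two boundary conditions used in the proof.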


D. Prime Radicals. Given a prime $p$, our final task is to investigate when the
polynomial $x^p - a \in F[x]$ is irreducible over $F$. Note that if $\alpha$ is a root of $x^p - a$, then
$\alpha^p = a$, so that the roots of $x^p - a$ are the $p$th roots of $a$. Here is our result.

Proposition 4.2.6 Let $p$ be prime. Then $f = x^p - a \in F[x]$ is irreducible over $F$ if
and only if $f$ has no roots in $F$.


Proof: One direction is obvious, for if $f$ has a root $\alpha \in F$, then $x - \alpha \in F[x]$ is
a factor of $f$ by Corollary A.1.15. Going the other way, we will assume that $f$ is
reducible and prove that $f$ has a root in $F$.

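We first study the roots of $f$. By Theorem 3.1.4, there is a field $F \subset L$ over which
$f$ splits completely, say


(4.7) $$f = (x - \alpha_1)(x - \alpha_2)\cdots(x - \alpha_p), \quad \alpha_1,\dots,\alpha_p \in L.$$


If $\alpha_1 = 0$, then $f$ has a root in $F$. Thus we may assume that $\alpha_1 \neq 0$. If we set


$$\zeta_i = \frac{\alpha_i}{\alpha_1}$$
for $1 \le i \le p$, then $\alpha_i^p = a$ implies that


$$\zeta_i^p = \frac{\alpha_i^p}{\alpha_1^p} = \frac{a}{a} = 1.$$


It follows that $\alpha_i = \zeta_i\alpha_1$, where $\zeta_i$ is a $p$th root of unity. Hence (4.7) can be written


(4.8) $$f = (x - \zeta_1\alpha_1)(x - \zeta_2\alpha_1)\cdots(x - \zeta_p\alpha_1).$$


Now suppose that $f = gh$, where $g, h \in F[x]$ have degree $r, s < p$. We may assume
that $g, h$ are monic by multiplying them by suitable constants if necessary. By $f = gh$
and unique factorization, $g$ must be a product of $r$ of the factors of (4.8). After
relabeling if necessary, we may assume that


$$g = (x - \zeta_1\alpha_1)(x - \zeta_2\alpha_1)\cdots(x - \zeta_r\alpha_1).$$
Since the constant term of $g$ lies in $F$, this implies that
$$\zeta\alpha_1^r \in F, \quad \text{where } \zeta = \zeta_1\cdots\zeta_r.$$


Note also that $\zeta^p = 1$.
Since $0 < r < p$ and $p$ is prime, $mr + np = 1$ for some $m, n \in \mathbb{Z}$. Then


$$\zeta^m\alpha_1 = \zeta^m\alpha_1^{mr+np} = (\zeta\alpha_1^r)^m(\alpha_1^p)^n \in F$$
since $\zeta\alpha_1^r \in F$ and $\alpha_1^p = a \in F$. It follows that $\zeta^m\alpha_1 \in F$. Thus
$$(\zeta^m\alpha_1)^p = (\zeta^p)^m\alpha_1^p = a$$
shows that $\zeta^m\alpha_1$ is a root of $f = x^p - a$ lying in $F$. ∎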


The $p$th roots of unity used in the above proof are more abstract than the roots of
unity constructed in Section A.2.

Here is an easy application of Proposition 4.2.6 that will be useful when we study
the casus irreducibilis in Chapter 8.


Example 4.2.7 Let $F$ be a subfield of $\mathbb{R}$ and $p$ be an odd prime. Given $a \in F$, we
define $\sqrt[p]{a}$ to be the real $p$th root of $a$. Furthermore, since $p$ is odd, $\sqrt[p]{a}$ is the only
real $p$th root of $a$ (be sure you understand why). Then Proposition 4.2.6 implies that
$x^p - a$ is irreducible over $F$ if and only if $\sqrt[p]{a} \notin F$. <>
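
Proposition 4.2.6 holds over any field, so it can be tested experimentally over a small finite field. The Python sketch below is an illustration only: it works over $\mathbb{F}_q$ for a prime $q$ (represented by integers modulo `q`), and the function names are hypothetical. It compares a brute-force reducibility test for $x^p - a$ with the much cheaper root test.

```python
from itertools import product

def poly_rem(f, g, q):
    """Remainder of f divided by monic g over F_q (lists, lowest degree first)."""
    r = [c % q for c in f]
    while len(r) >= len(g):
        c = r[-1]
        shift = len(r) - len(g)
        for k, b in enumerate(g):
            r[shift + k] = (r[shift + k] - c * b) % q
        r.pop()  # the leading coefficient is now zero
    return r

def has_root(p, a, q):
    """True if x^p - a has a root in F_q."""
    return any((pow(x, p, q) - a) % q == 0 for x in range(q))

def is_reducible(f, q):
    """Brute force: try every monic g with 0 < deg g <= deg(f) // 2."""
    n = len(f) - 1
    for d in range(1, n // 2 + 1):
        for low in product(range(q), repeat=d):
            g = list(low) + [1]
            if not any(poly_rem(f, g, q)):
                return True
    return False

def radical_poly(p, a, q):
    """Coefficient list of x^p - a over F_q."""
    return [(-a) % q] + [0] * (p - 1) + [1]

# x^5 - 2 over F_11: 2 is not a fifth power mod 11
print(has_root(5, 2, 11), is_reducible(radical_poly(5, 2, 11), 11))
```

For prime $p$ the two tests agree on every input tried, as the proposition predicts. The root test needs only $q$ modular exponentiations, while the brute-force search grows like $q^{p/2}$.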


Historical Notes 


The factorization algorithm for polynomials in Q[x] is due to Kronecker and was 
part of his constructive approach to algebra. Precise references can be found in [4]. 

The Schönemann–Eisenstein criterion was published by Schönemann in 1846 and
independently by Eisenstein in 1850. Although it is often called the “Eisenstein
criterion,” Schönemann’s name should be included, since he proved it first. See [4]
and [9, p. 254] for references to the original papers. See also [2].

The slick proof of the irreducibility of $\Phi_p(x)$ given in Proposition 4.2.5 is due
to Eisenstein. In Chapter 15 we will explore the fascinating mathematics that led
Eisenstein to the irreducibility criterion.

Schönemann discovered the criterion in a very different context. He asked whether
a polynomial that is reducible modulo $p$ remains reducible modulo $p^2$. His version
of the criterion states that polynomials of the form $(x - a)^n + pF(x)$, where $a \in \mathbb{Z}$,
$F(x) \in \mathbb{Z}[x]$, and $p \nmid F(a)$, are always irreducible. You will prove this in Exercise 2,
and in Exercise 3 you will use this to give another proof of Proposition 4.2.5.

The first proof of Proposition 4.2.5 is due to Gauss in 1799 as part of his study 
of regular polygons in Disquisitiones [5]. We will say more about this in Chapters 9 
and 10. In 1818, Gauss gave an interesting application of Proposition 4.2.5. His sixth 
proof of quadratic reciprocity used congruences modulo $x^{p-1} + \cdots + x + 1$, which in
modern terms means that he was working in the quotient ring


$$\mathbb{Q}[x]/(x^{p-1} + \cdots + x + 1).$$


Earlier, Gauss had given a proof of quadratic reciprocity (his fourth, in 1811) using the
$p$th root of unity $\zeta_p = e^{2\pi i/p}$. Since $x^{p-1} + \cdots + x + 1$ is irreducible, it is the minimal
polynomial of $\zeta_p$. Combining this with Lemma 4.1.13 and Proposition 4.1.14 gives
an isomorphism


$$\mathbb{Q}(\zeta_p) = \mathbb{Q}[\zeta_p] \simeq \mathbb{Q}[x]/(x^{p-1} + \cdots + x + 1).$$


It follows that Gauss’s sixth proof of quadratic reciprocity is a version of the fourth, 
with complex numbers replaced by the above quotient ring. 

Abel used many properties of radicals in his proof of the unsolvability of the
general quintic. In a manuscript written shortly before his death in 1829, Abel
proved Proposition 4.2.6 in the special case when $\zeta_p \in F$ [Abel, Vol. II, p. 229],
and the general case is due to Kronecker in 1879 [Kronecker, Vol. IV, pp. 75-76].
An even more general version of Proposition 4.2.6 is the following 1901 theorem of
Capelli (see [Chebotarev, p. 294]).

Theorem 4.2.8 Let $f = x^n - a \in F[x]$. Then $f$ is reducible over $F$ if and only if $n$
has a divisor $d > 1$ such that


$$a = \begin{cases} b^d, & b \in F, \text{ or} \\ -4c^4, & d = 4,\ c \in F. \end{cases}$$ ∎


Exercises for Section 4.2 


Exercise 1. This exercise will study the Lagrange interpolation formula. Suppose that $F$ is a
field and that $b_0,\dots,b_d,c_0,\dots,c_d \in F$, where $b_0,\dots,b_d$ are distinct and $d \ge 1$. Then consider
the polynomial
$$g(x) = \sum_{i=0}^{d} c_i \prod_{j \neq i} \frac{x - b_j}{b_i - b_j} \in F[x].$$

(a) Explain why $\deg(g) \le d$, and give an example for $F = \mathbb{R}$ and $d = 2$ where $\deg(g) < 2$.
(b) Show that $g(b_i) = c_i$ for $i = 0,\dots,d$.

(c) Let $h$ be a polynomial in $F[x]$ with $\deg(h) \le d$ such that $h(b_i) = c_i$ for $i = 0,\dots,d$. Prove
that $h = g$.


Exercise 2. This exercise deals with Schönemann’s version of the irreducibility criterion.

(a) Let $f(x) = (x - a)^n + pF(x)$, where $a \in \mathbb{Z}$ and $F(x) \in \mathbb{Z}[x]$ satisfy $\deg(F) \le n$ and
$p \nmid F(a)$. Prove that $f$ is irreducible over $\mathbb{Q}$.

(b) More generally, let $g(x) \in \mathbb{Z}[x]$ be irreducible modulo $p$ (i.e., reducing its coefficients
modulo $p$ gives an irreducible polynomial in $\mathbb{F}_p[x]$). Then let $f(x) = g(x)^n + pF(x)$,
where $F(x) \in \mathbb{Z}[x]$ and $g(x)$ and $F(x)$ are relatively prime modulo $p$. Also assume that
$\deg(F) \le n\deg(g)$. Prove that $f$ is irreducible over $\mathbb{Q}$.


Exercise 3. Use part (a) of Exercise 2 with $a = 1$ to give another proof of Proposition 4.2.5.


Exercise 4. For each of the following polynomials, use a computer to determine whether it is 
irreducible over the given field. 

(a) x44 +2? +242 over Q. 

(b) 3x°+ 6x° + 9x4 4 2x3 + 3x? + 1 over Q and Q(W/2). 


Exercise 5. Find the minimal polynomial of the 24th root of unity $\zeta_{24}$ as follows.
(a) Factor $x^{24} - 1$ over $\mathbb{Q}$.
(b) Determine which of the factors is the minimal polynomial of $\zeta_{24}$.


Exercise 6. Let $F$ be a finite field. Explain why there is an algorithm for deciding whether
$f \in F[x]$ is irreducible.


Exercise 7. For each of the following polynomials, determine, without using a computer, 
whether it is irreducible over the given field. 

(a) x +x+ 1 over Fs. 

(b) x44+x+1 over Fo. 


Exercise 8. Let $a \in \mathbb{Z}$ be a product of distinct prime numbers. Prove that $x^n - a$ is irreducible
over $\mathbb{Q}$ for any $n \ge 1$. What does this imply about $\sqrt[n]{a}$ when $n \ge 2$?


Exercise 9. Let $k$ be a field, and let $F = k(t)$ be the field of rational functions in $t$ with
coefficients in $k$. Then consider $f = x^p - t \in F[x]$, where $p$ is prime. By Proposition 4.2.6, $f$
is irreducible provided we can show that $f$ has no roots in $F$. Prove this.


4.3 THE DEGREE OF AN EXTENSION


When $F$ is a subfield of a field $L$, there is one bit of structure that hasn’t been used
yet. We know that $L$ is an Abelian group under addition. Furthermore, since $F \subset L$,
the ability to multiply elements of $L$ implies that we can multiply elements of $F$ times
elements of $L$. This gives a scalar multiplication, and one can easily check that $L$
becomes a vector space over $F$.


A. Finite Extensions. The above paragraph leads to the following definition. 


Definition 4.3.1 Let $F \subset L$ be a field extension.
(a) $L$ is a finite extension of $F$ if $L$ is a finite-dimensional vector space over $F$.
(b) The degree of $L$ over $F$, denoted $[L:F]$, is defined as follows:


$$[L:F] = \begin{cases} \dim_F L, & \text{if } L \text{ is a finite extension of } F, \\ \infty, & \text{otherwise}, \end{cases}$$


where $\dim_F L$ is the dimension of $L$ as a vector space over $F$.
Here is a simple example.


Example 4.3.2 For $\mathbb{R} \subset \mathbb{C}$, the usual way of writing complex numbers as $a + bi$
shows that 1 and $i$ form a basis of $\mathbb{C}$ as a vector space over $\mathbb{R}$. Thus $[\mathbb{C}:\mathbb{R}] = 2$. <>


We can also characterize extensions of degree 1. 


Lemma 4.3.3 An extension $F \subset L$ has degree $[L:F] = 1$ if and only if $F = L$.


Proof: If $[L:F] = 1$, then any nonzero element of $L$, say $1 \in L$, is a basis. Thus
$L = \{a \cdot 1 \mid a \in F\} = F$. The opposite implication is even easier and is omitted. ∎


In general, we compute the degree of an extension $F \subset F(\alpha)$ as follows.


Proposition 4.3.4 Suppose that $F \subset L$ is an extension and $\alpha \in L$.

(a) $\alpha$ is algebraic over $F$ if and only if $[F(\alpha):F] < \infty$.

(b) Let $\alpha$ be algebraic over $F$. If $n$ is the degree of the minimal polynomial of $\alpha$ over
$F$, then $1,\alpha,\dots,\alpha^{n-1}$ form a basis of $F(\alpha)$ over $F$. Thus $[F(\alpha):F] = n$.


Proof: First suppose that $\alpha$ is algebraic over $F$ with minimal polynomial $p$, where
$n = \deg(p)$. We need to show that $1,\alpha,\dots,\alpha^{n-1}$ form a basis of $F(\alpha)$ over $F$. Since
$F(\alpha) = F[\alpha]$, every element of $F(\alpha)$ is of the form $g(\alpha)$ for some $g \in F[x]$. Dividing
$g$ by $p$ gives


$$g = qp + a_0 + a_1x + \cdots + a_{n-1}x^{n-1},$$


where $q \in F[x]$ and $a_0,\dots,a_{n-1} \in F$, and evaluating this at $x = \alpha$ yields


(4.9) $$g(\alpha) = a_0 + a_1\alpha + \cdots + a_{n-1}\alpha^{n-1},$$
since $p(\alpha) = 0$. Thus $1,\alpha,\dots,\alpha^{n-1}$ span $F(\alpha)$ over $F$. To show linear independence,
suppose that


$$0 = a_0 + a_1\alpha + \cdots + a_{n-1}\alpha^{n-1},$$


where $a_0,\dots,a_{n-1} \in F$. Then $\alpha$ is a root of $a_0 + a_1x + \cdots + a_{n-1}x^{n-1} \in F[x]$. Since
the minimal polynomial $p$ has degree $n$, this must be the zero polynomial. Hence
$a_i = 0$ for all $i$, and linear independence is proved. Then $[F(\alpha):F] = n$ follows from
Definition 4.3.1.

This proves part (b) of the proposition and also one implication of part (a). It
remains to consider the case when $[F(\alpha):F] < \infty$. If we let $n = [F(\alpha):F]$, then
$F(\alpha)$ is an $n$-dimensional vector space over $F$. This implies that any collection of
$n + 1$ elements of $F(\alpha)$ is linearly dependent. In particular, $1, \alpha, \alpha^2, \dots, \alpha^n$ are
linearly dependent over $F$. Hence there are $a_0,\dots,a_n \in F$, not all zero, such that
(4.10) $$a_0 + a_1\alpha + a_2\alpha^2 + \cdots + a_n\alpha^n = 0.$$

As in the previous paragraph, it follows that $\alpha$ is a root of

(4.11) $$a_0 + a_1x + a_2x^2 + \cdots + a_nx^n \in F[x],$$


which is nonzero, since the $a_i$’s are not all zero. Hence $\alpha$ is algebraic over $F$, and the
proposition is proved. ∎


This proposition implies that when the minimal polynomial of $\alpha$ has degree $n$,
every $\beta \in F(\alpha)$ can be written uniquely in the form

$$\beta = a_0 + a_1\alpha + \cdots + a_{n-1}\alpha^{n-1}, \qquad a_0, \dots, a_{n-1} \in F.$$


In Exercise 1 you will use an argument similar to the proof just given to describe 
unique coset representatives for elements of F[x]/(f). 
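
The division step behind (4.9) is easy to carry out mechanically. The following is a minimal sketch, not part of the text: the helper name `poly_rem` and the coefficient-list convention are assumptions made for illustration. It reduces $g$ modulo $p$ over $\mathbb{Q}$ and returns the coordinates of $g(\alpha)$ in the basis $1, \alpha, \dots, \alpha^{n-1}$, here with $p = x^3 - 2$ (shown to be irreducible over $\mathbb{Q}$ in Example 4.3.10).

```python
from fractions import Fraction

def poly_rem(g, p):
    """Remainder of g on division by p.

    Polynomials are coefficient lists, lowest degree first, so
    [-2, 0, 0, 1] stands for x^3 - 2. The last entry of p must be nonzero.
    """
    n = len(p) - 1                      # n = deg(p)
    r = [Fraction(c) for c in g]
    while len(r) > n:
        lead = r.pop()                  # coefficient of x^d, where d = len(r) after the pop
        if lead == 0:
            continue
        factor = lead / Fraction(p[-1])
        shift = len(r) - n              # subtract factor * x^(d-n) * p from what is left
        for i in range(n):
            r[shift + i] -= factor * p[i]
    # Pad with zeros so the result always has exactly n coordinates.
    return r + [Fraction(0)] * (n - len(r))

# Coordinates of alpha^5 + alpha in the basis 1, alpha, alpha^2, where alpha^3 = 2.
coords = poly_rem([0, 1, 0, 0, 0, 1], [-2, 0, 0, 1])
print(coords)
```

The three returned coordinates say that $\alpha^5 + \alpha = \alpha + 2\alpha^2$, which agrees with $\alpha^5 = \alpha^2 \cdot \alpha^3 = 2\alpha^2$.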

Looking back at Example 4.3.2, we see that $[\mathbb{C}:\mathbb{R}] = 2$ follows from Proposition 4.3.4,
since $\mathbb{C} = \mathbb{R}(i)$ and the minimal polynomial of $i$ over $\mathbb{R}$ is $x^2 + 1$. Here
are some other examples of Proposition 4.3.4.


Example 4.3.5 Consider the extension $\mathbb{Q} \subset \mathbb{Q}(\sqrt{2})$. Since the minimal polynomial
of $\sqrt{2}$ is $x^2 - 2$, the proposition implies that $[\mathbb{Q}(\sqrt{2}):\mathbb{Q}] = 2$ and that

(4.12)  $$\mathbb{Q}(\sqrt{2}) = \{a + b\sqrt{2} \mid a, b \in \mathbb{Q}\}.$$

Note also that this representation is unique. ◁▷


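Example 4.3.6 By Example 4.1.7, the minimal polynomial of $\sqrt{2} + \sqrt{3}$ over $\mathbb{Q}$ is
$x^4 - 10x^2 + 1$. Thus $[\mathbb{Q}(\sqrt{2} + \sqrt{3}):\mathbb{Q}] = 4$, and every $\beta \in \mathbb{Q}(\sqrt{2} + \sqrt{3})$ can be
written

$$\beta = a + b(\sqrt{2} + \sqrt{3}) + c(\sqrt{2} + \sqrt{3})^2 + d(\sqrt{2} + \sqrt{3})^3$$

for unique $a, b, c, d \in \mathbb{Q}$. ◁▷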
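
The claims of Example 4.3.6 can be checked with exact arithmetic. The sketch below is an illustration only: it stores $a + b\sqrt{2} + c\sqrt{3} + d\sqrt{6}$ as the tuple `(a, b, c, d)`, and the helper names `mul`, `add`, `scale` and `det` are hypothetical. It confirms that $\alpha = \sqrt{2} + \sqrt{3}$ satisfies $\alpha^4 - 10\alpha^2 + 1 = 0$ and that $1, \alpha, \alpha^2, \alpha^3$ are linearly independent over $\mathbb{Q}$, taking for granted that $1, \sqrt{2}, \sqrt{3}, \sqrt{6}$ are linearly independent (as Example 4.3.9 shows).

```python
# An element a + b*sqrt2 + c*sqrt3 + d*sqrt6 is stored as the tuple (a, b, c, d).

def mul(u, v):
    """Multiply two elements, using sqrt2*sqrt3 = sqrt6, sqrt2*sqrt6 = 2*sqrt3, sqrt3*sqrt6 = 3*sqrt2."""
    a1, b1, c1, d1 = u
    a2, b2, c2, d2 = v
    return (
        a1 * a2 + 2 * b1 * b2 + 3 * c1 * c2 + 6 * d1 * d2,   # rational part
        a1 * b2 + b1 * a2 + 3 * (c1 * d2 + d1 * c2),          # coefficient of sqrt2
        a1 * c2 + c1 * a2 + 2 * (b1 * d2 + d1 * b2),          # coefficient of sqrt3
        a1 * d2 + d1 * a2 + b1 * c2 + c1 * b2,                # coefficient of sqrt6
    )

def add(u, v):
    return tuple(x + y for x, y in zip(u, v))

def scale(k, u):
    return tuple(k * x for x in u)

def det(m):
    """Determinant by cofactor expansion along the first row (fine for a 4 x 4 matrix)."""
    if len(m) == 1:
        return m[0][0]
    return sum(
        (-1) ** j * m[0][j] * det([row[:j] + row[j + 1:] for row in m[1:]])
        for j in range(len(m))
    )

ONE = (1, 0, 0, 0)
ALPHA = (0, 1, 1, 0)                      # sqrt2 + sqrt3

powers = [ONE]                            # powers[k] = alpha^k for k = 0, ..., 4
for _ in range(4):
    powers.append(mul(powers[-1], ALPHA))

# alpha^4 - 10*alpha^2 + 1, which should be the zero element.
relation = add(add(powers[4], scale(-10, powers[2])), ONE)

# Coordinates of 1, alpha, alpha^2, alpha^3 as rows; a nonzero determinant means independence.
independence_det = det(powers[:4])
```

A nonzero determinant means that the coordinate vectors of $1, \alpha, \alpha^2, \alpha^3$ are independent, consistent with $[\mathbb{Q}(\sqrt{2} + \sqrt{3}):\mathbb{Q}] = 4$.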

Example 4.3.7 Let $F(x)$ be the field of rational functions in the variable $x$ with
coefficients in $F$. Then Proposition 4.3.4 implies that $[F(x):F] = \infty$, since $x$ is not
algebraic over $F$. ◁▷


B. The Tower Theorem. We can also determine how the degree behaves when
we have successive extensions $F \subset K \subset L$. The following result is sometimes called
the Tower Theorem.


Theorem 4.3.8 Suppose that we have fields $F \subset K \subset L$.
(a) If $[K:F] = \infty$ or $[L:K] = \infty$, then $[L:F] = \infty$.
(b) If $[K:F] < \infty$ and $[L:K] < \infty$, then $[L:F] = [L:K][K:F]$.


Proof: We will prove the contrapositive of part (a): if $[L:F] < \infty$, then $[K:F] < \infty$
and $[L:K] < \infty$. Thus we may assume that $L$ has finite dimension as a vector space
over $F$. Let $\gamma_1, \dots, \gamma_N$ be a basis. Then:


• One easily sees that $K \subset L$ is a subspace of $L$ over the field $F$. Since $L$ has finite
dimension over $F$, so does any subspace. Hence $[K:F] = \dim_F K < \infty$.

• Take $\alpha \in L$. Since $\gamma_1, \dots, \gamma_N$ span $L$ over $F$, $\alpha = \sum_{i=1}^{N} a_i\gamma_i$, where $a_i \in F$. Since
$F \subset K$, we can consider this as a linear combination with coefficients in $K$. Thus
$L$ is spanned over $K$ by a finite set, so that $[L:K] = \dim_K L < \infty$.

To prove part (b), let $m = [K:F]$ and $n = [L:K]$, and pick bases $\alpha_1, \dots, \alpha_m$ of $K$ over
$F$ and $\beta_1, \dots, \beta_n$ of $L$ over $K$. We will prove that the $mn$ products

$$\alpha_i\beta_j, \qquad 1 \le i \le m, \quad 1 \le j \le n,$$

form a basis of $L$ over $F$. This will prove the theorem.

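We first show that the $\alpha_i\beta_j$ span $L$ over $F$. Take $\gamma \in L$. Since $\beta_1, \dots, \beta_n$ span $L$
over $K$, we can write $\gamma = \sum_{j=1}^{n} b_j\beta_j$, where $b_1, \dots, b_n \in K$. Then, since $\alpha_1, \dots, \alpha_m$
span $K$ over $F$, we have $b_j = \sum_{i=1}^{m} a_{ij}\alpha_i$, where $a_{ij} \in F$. Combining these equations,
we obtain

$$\gamma = \sum_{j=1}^{n} \Big(\sum_{i=1}^{m} a_{ij}\alpha_i\Big)\beta_j = \sum_{i=1}^{m}\sum_{j=1}^{n} a_{ij}\alpha_i\beta_j.$$

Since $a_{ij} \in F$, this shows that the $\alpha_i\beta_j$ span $L$ over $F$.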
To prove linear independence, suppose that we have a linear relation

$$\sum_{i=1}^{m}\sum_{j=1}^{n} a_{ij}\alpha_i\beta_j = 0$$

where $a_{ij} \in F$. As above, we can write this as

$$\sum_{j=1}^{n} \Big(\sum_{i=1}^{m} a_{ij}\alpha_i\Big)\beta_j = 0.$$

The expressions in the large parentheses all lie in $K$, and since the $\beta_j$ are linearly
independent over $K$, we conclude that

$$\sum_{i=1}^{m} a_{ij}\alpha_i = 0 \quad \text{for } 1 \le j \le n.$$

Since the $\alpha_i$ are linearly independent over $F$ and $a_{ij} \in F$, we must have $a_{ij} = 0$ for
all $i$ and $j$. This proves the desired linear independence. ∎


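Here are two examples of the Tower Theorem.

Example 4.3.9 We will analyze $\mathbb{Q} \subset \mathbb{Q}(\sqrt{2}, \sqrt{3})$ using

$$\mathbb{Q} \subset \mathbb{Q}(\sqrt{2}) \subset \mathbb{Q}(\sqrt{2}, \sqrt{3}).$$

Proposition 4.3.4 shows that $1, \sqrt{2}$ form a basis of $\mathbb{Q}(\sqrt{2})$ over $\mathbb{Q}$, since $x^2 - 2$ is the
minimal polynomial of $\sqrt{2}$ over $\mathbb{Q}$. Furthermore, part (a) of Exercise 8 of Section 4.1
shows that $x^2 - 3$ is the minimal polynomial of $\sqrt{3}$ over $\mathbb{Q}(\sqrt{2})$, so that $1, \sqrt{3}$ form
a basis of $\mathbb{Q}(\sqrt{2}, \sqrt{3})$ over $\mathbb{Q}(\sqrt{2})$. Thus:

• $[\mathbb{Q}(\sqrt{2}, \sqrt{3}):\mathbb{Q}] = [\mathbb{Q}(\sqrt{2}, \sqrt{3}):\mathbb{Q}(\sqrt{2})]\,[\mathbb{Q}(\sqrt{2}):\mathbb{Q}] = 2 \cdot 2 = 4$.

• The proof of Theorem 4.3.8 shows that the products of the bases $1, \sqrt{2}$ and $1, \sqrt{3}$,
namely $1, \sqrt{2}, \sqrt{3}, \sqrt{2}\sqrt{3} = \sqrt{6}$, give a basis of $\mathbb{Q}(\sqrt{2}, \sqrt{3})$ over $\mathbb{Q}$.

Example 4.1.16 showed that $1, \sqrt{2}, \sqrt{3}, \sqrt{6}$ span $\mathbb{Q}(\sqrt{2}, \sqrt{3})$ over $\mathbb{Q}$. We now see
that $1, \sqrt{2}, \sqrt{3}, \sqrt{6}$ form a basis that arises naturally from Theorem 4.3.8.

In Section 4.2, we used Maple and Mathematica to show that $\mathbb{Q}(\sqrt{2}, \sqrt{3}) =
\mathbb{Q}(\sqrt{2} + \sqrt{3})$ (see (4.5)). We now give a different proof using

$$\mathbb{Q} \subset \mathbb{Q}(\sqrt{2} + \sqrt{3}) \subset \mathbb{Q}(\sqrt{2}, \sqrt{3}).$$

We just showed that $[\mathbb{Q}(\sqrt{2}, \sqrt{3}):\mathbb{Q}] = 4$, and Example 4.3.6 tells us that the same
is true for $\mathbb{Q} \subset \mathbb{Q}(\sqrt{2} + \sqrt{3})$. Then

$$[\mathbb{Q}(\sqrt{2}, \sqrt{3}):\mathbb{Q}] = [\mathbb{Q}(\sqrt{2}, \sqrt{3}):\mathbb{Q}(\sqrt{2} + \sqrt{3})]\,[\mathbb{Q}(\sqrt{2} + \sqrt{3}):\mathbb{Q}]$$

gives $[\mathbb{Q}(\sqrt{2}, \sqrt{3}):\mathbb{Q}(\sqrt{2} + \sqrt{3})] = 1$. Thus $\mathbb{Q}(\sqrt{2} + \sqrt{3}) = \mathbb{Q}(\sqrt{2}, \sqrt{3})$. ◁▷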
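
The equality $\mathbb{Q}(\sqrt{2} + \sqrt{3}) = \mathbb{Q}(\sqrt{2}, \sqrt{3})$ can also be seen by a direct computation. As a worked sketch (the symbol $\alpha = \sqrt{2} + \sqrt{3}$ is introduced here only for illustration), expanding $\alpha^3$ expresses both $\sqrt{2}$ and $\sqrt{3}$ as polynomials in $\alpha$ with rational coefficients:

```latex
\begin{align*}
\alpha^3 &= (\sqrt{2}+\sqrt{3})^3
          = 2\sqrt{2} + 6\sqrt{3} + 9\sqrt{2} + 3\sqrt{3}
          = 11\sqrt{2} + 9\sqrt{3}, \\
\sqrt{2} &= \tfrac{1}{2}\,(\alpha^3 - 9\alpha), \qquad
\sqrt{3} = \tfrac{1}{2}\,(11\alpha - \alpha^3).
\end{align*}
```

Hence $\sqrt{2}, \sqrt{3} \in \mathbb{Q}(\alpha)$, which gives the inclusion $\mathbb{Q}(\sqrt{2}, \sqrt{3}) \subset \mathbb{Q}(\sqrt{2} + \sqrt{3})$ without any degree count.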


Example 4.3.10 Let $\omega = e^{2\pi i/3}$ and $L = \mathbb{Q}(\omega, \sqrt[3]{2})$. We will compute $[L:\mathbb{Q}]$ using
the extension fields

$$\mathbb{Q} \subset \mathbb{Q}(\sqrt[3]{2}) \subset \mathbb{Q}(\omega, \sqrt[3]{2}) = L.$$

To determine $[\mathbb{Q}(\sqrt[3]{2}):\mathbb{Q}]$, first observe that $x^3 - 2$ is irreducible over $\mathbb{Q}$. Since $x^3 - 2$
has degree 3, one can prove this using Lemma A.1.19 and Proposition A.3.1, though
it is quicker to use the Schönemann–Eisenstein criterion (Theorem 4.2.3) with $p = 2$.
By Proposition 4.3.4 we conclude that

$$[\mathbb{Q}(\sqrt[3]{2}):\mathbb{Q}] = 3.$$

We next compute $[L:\mathbb{Q}(\sqrt[3]{2})]$. Recall that $x^2 + x + 1$ has roots $\omega$ and $\omega^2$, neither of
which is real. Since $\mathbb{Q}(\sqrt[3]{2}) \subset \mathbb{R}$, $x^2 + x + 1$ has no root in this field, so that $x^2 + x + 1$
is the minimal polynomial of $\omega$ over $\mathbb{Q}(\sqrt[3]{2})$. Hence

$$[L:\mathbb{Q}(\sqrt[3]{2})] = 2,$$

since $L = \mathbb{Q}(\sqrt[3]{2})(\omega)$. Then Theorem 4.3.8 implies that

$$[L:\mathbb{Q}] = [L:\mathbb{Q}(\sqrt[3]{2})]\,[\mathbb{Q}(\sqrt[3]{2}):\mathbb{Q}] = 2 \cdot 3 = 6.$$

We will return to this example often. ◁▷
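
As in Example 4.3.9, the proof of Theorem 4.3.8 also produces an explicit basis here. A sketch, writing $\beta = \sqrt[3]{2}$ (a symbol introduced only for this illustration): Proposition 4.3.4 gives the basis $1, \beta, \beta^2$ of $\mathbb{Q}(\beta)$ over $\mathbb{Q}$ and the basis $1, \omega$ of $L$ over $\mathbb{Q}(\beta)$, and their products form a basis of $L$ over $\mathbb{Q}$:

```latex
\[
\{\beta^i\omega^j \mid 0 \le i \le 2,\ 0 \le j \le 1\}
  = \{1,\ \beta,\ \beta^2,\ \omega,\ \beta\omega,\ \beta^2\omega\},
\qquad \beta = \sqrt[3]{2},\quad \omega = e^{2\pi i/3}.
\]
```

The six elements match the degree $[L:\mathbb{Q}] = 6$ computed above.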


Mathematical Notes 
Here is one of the ideas used in this section. 


■ Algebras over a Field. The key idea of this section is that a field extension $F \subset L$
gives $L$ the structure of a vector space over $F$, so that $L$ is simultaneously a field and a
vector space. In general, there are many examples of rings that are also vector spaces
over a field. Here is the general definition.


Definition 4.3.11 A (possibly noncommutative) ring $R$ is an algebra over the field
$F$ if $R$ is a vector space over $F$ such that:

(a) The vector space addition on $R$ is the same as the ring addition on $R$.

(b) The scalar multiplication on $R$ is compatible with the ring multiplication:

$$(ab) \cdot r = a \cdot (b \cdot r) \quad \text{for all } a, b \in F \text{ and } r \in R,$$
$$a \cdot (rs) = (a \cdot r)s = r(a \cdot s) \quad \text{for all } a \in F \text{ and } r, s \in R.$$


A field extension $F \subset L$ makes $L$ into an $F$-algebra. Other examples include the
polynomial ring $F[x_1, \dots, x_n]$ and the algebra of $n \times n$ matrices $M_{n \times n}(F)$.


Historical Notes 


The idea of representing elements of a field as linear combinations has a long 
history. For example, in 1847 Cauchy took a polynomial $f \in F[x]$ of degree $n$ and
represented elements of $F[x]/(f)$ as linear combinations of the cosets of $1, x, \dots, x^{n-1}$.
Kronecker also represented elements of extension fields using linear combinations, 
and he was aware of the importance of linear independence. But in all of this work, 
the term “degree” applied only to degrees of polynomials. 

In 1871 Dedekind developed a theory of field extensions that included the concept 
of degree. He writes an extension as $A \subset \Omega$ and gives the modern criterion for
$\omega_1, \dots, \omega_n \in \Omega$ to be linearly independent over $A$. Furthermore, if the $\omega_i$ span $\Omega$, then
he sets $(\Omega, A) = n$. He also knows Proposition 4.3.4, but only gives special cases of
Theorem 4.3.8. 


The modern formulation of the results of this section is due to Emil Artin. He 
developed his approach to Galois theory in the 1920s. He turned Dedekind's $(\Omega, A)$
into the degree $[L:F]$ and made it the centerpiece of his theory of finite extensions.
Artin profoundly transformed the way people think about Galois theory. We will say 
more about this in Section 6.1. 

For more details on the history of how these concepts developed, we refer the 
reader to [6] and [7]. 


Exercises for Section 4.3 


Exercise 1. In (4.9) we represented elements of $F(\alpha)$ uniquely using remainders on division
by the minimal polynomial of $\alpha$. In this exercise you will adapt the proof of Proposition 4.3.4
to the case of quotient rings. Suppose that $f \in F[x]$ has degree $n > 0$. Prove that every coset
of $F[x]/(f)$ can be written as

$$a_0 + a_1 x + \cdots + a_{n-1}x^{n-1} + (f),$$

where $a_0, a_1, \dots, a_{n-1} \in F$ are unique.


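Exercise 2. Compute the degrees of the following extensions:
(a) $\mathbb{Q} \subset \mathbb{Q}(i, \sqrt[4]{2})$.

(b) $\mathbb{Q} \subset \mathbb{Q}(\sqrt{3}, \sqrt[3]{2})$.

(c) $\mathbb{Q} \subset \mathbb{Q}\big(\sqrt{2 + \sqrt{2}}\big)$.

(d) $\mathbb{Q} \subset \mathbb{Q}\big(i, \sqrt{2 + \sqrt{2}}\big)$.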


Exercise 3. For each of the extensions in Exercise 2, find a basis over Q using the method of 
Example 4.3.9. 


Exercise 4. Suppose that $F \subset L$ is a finite extension with $[L:F]$ prime.
(a) Show that the only subfields of $L$ containing $F$ are $F$ and $L$.
(b) Show that $L = F(\alpha)$ for any $\alpha \in L \setminus F$.


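Exercise 5. Consider the extension $\mathbb{Q} \subset L = \mathbb{Q}(\sqrt[4]{2}, \sqrt[3]{3})$. We will compute $[L:\mathbb{Q}]$.
(a) Show that $x^4 - 2$ and $x^3 - 3$ are irreducible over $\mathbb{Q}$.
(b) Use $\mathbb{Q} \subset \mathbb{Q}(\sqrt[4]{2}) \subset L$ to show that $4 \mid [L:\mathbb{Q}]$ and $[L:\mathbb{Q}] \le 12$.
(c) Use $\mathbb{Q} \subset \mathbb{Q}(\sqrt[3]{3}) \subset L$ to show that $[L:\mathbb{Q}]$ is also divisible by 3.
(d) Explain why parts (b) and (c) imply that $[L:\mathbb{Q}] = 12$. This works because 3 and 4 are
relatively prime. Do you see why?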


Exercise 6. Suppose that $\alpha$ and $\beta$ are algebraic over $F$ with minimal polynomials $f$ and $g$,
respectively. Prove the Reciprocity Theorem: $f$ is irreducible over $F(\beta)$ if and only if $g$ is
irreducible over $F(\alpha)$.


Exercise 7. Suppose we have extensions $L_0 \subset L_1 \subset \cdots \subset L_m$. Use induction to prove the
following generalization of Theorem 4.3.8:

(a) If $[L_i:L_{i-1}] = \infty$ for some $1 \le i \le m$, then $[L_m:L_0] = \infty$.

(b) If $[L_i:L_{i-1}] < \infty$ for all $1 \le i \le m$, then

$$[L_m:L_0] = [L_m:L_{m-1}]\,[L_{m-1}:L_{m-2}] \cdots [L_2:L_1]\,[L_1:L_0].$$

4.4 ALGEBRAIC EXTENSIONS


Now that we know the basic properties of the degree of an extension, we can continue 
our study of algebraic elements. We begin with a definition. 


Definition 4.4.1 A field extension $F \subset L$ is algebraic if every element of $L$ is algebraic
over $F$.


It turns out that finite extensions are always algebraic. 


Lemma 4.4.2 Let $F \subset L$ be a finite extension. Then:
(a) $F \subset L$ is algebraic.
(b) If $\alpha \in L$, then the degree of the minimal polynomial of $\alpha$ over $F$ divides $[L:F]$.


Proof: An element $\alpha \in L$ gives $F \subset F(\alpha) \subset L$, and then the Tower Theorem implies
that $[F(\alpha):F]$ is finite and divides $[L:F]$. Then (a) and (b) follow immediately from
Proposition 4.3.4. ∎


Exercise 1 will show that the converse of this lemma is false—there are algebraic 
extensions that are not finite. So a finite extension is an especially nice algebraic 
extension. 

We next explore the structure of finite extensions. 


Theorem 4.4.3 Let $F \subset L$ be a field extension. Then $[L:F] < \infty$ if and only if there
are $\alpha_1, \dots, \alpha_m \in L$ such that each $\alpha_i$ is algebraic over $F$ and $L = F(\alpha_1, \dots, \alpha_m)$.


Proof: First suppose that $[L:F] < \infty$. Let $\alpha_1, \dots, \alpha_m \in L$ be a basis of $L$ as a vector
space over $F$ (so that $m = \dim_F L$). Then

$$L = \{a_1\alpha_1 + \cdots + a_m\alpha_m \mid a_1, \dots, a_m \in F\} \subset F(\alpha_1, \dots, \alpha_m) \subset L$$

proves that $L = F(\alpha_1, \dots, \alpha_m)$. Each $\alpha_i$ is algebraic over $F$ by Lemma 4.4.2.

Going the other way, suppose that $L = F(\alpha_1, \dots, \alpha_m)$ where each $\alpha_i$ is algebraic
over $F$. Let $L_0 = F$ and $L_i = F(\alpha_1, \dots, \alpha_i)$ for $1 \le i \le m$. Then we get field extensions

(4.13)  $$F = L_0 \subset L_1 \subset \cdots \subset L_m = L,$$

and Corollary 4.1.11 shows that

$$L_i = F(\alpha_1, \dots, \alpha_{i-1}, \alpha_i) = F(\alpha_1, \dots, \alpha_{i-1})(\alpha_i) = L_{i-1}(\alpha_i)$$

for $1 \le i \le m$. Since $\alpha_i$ is algebraic over $F$, it is also algebraic over the larger field
$L_{i-1} \supset F$. Then Proposition 4.3.4 implies that

$$[L_i:L_{i-1}] = [L_{i-1}(\alpha_i):L_{i-1}] < \infty,$$

so that every successive extension in (4.13) has finite degree. Then the generalization
of the Tower Theorem given in Exercise 7 of Section 4.3 implies that

$$[L:F] = [L_m:L_0] = [L_m:L_{m-1}] \cdots [L_1:L_0] < \infty.$$

This completes the proof of the theorem. ∎

As an application of the theorem just proved, let us show that the sum and product
of algebraic elements are algebraic.


Proposition 4.4.4 Let $F \subset L$ be a field extension. If $\alpha, \beta \in L$ are algebraic over $F$,
then so are $\alpha + \beta$ and $\alpha\beta$.


Proof: Theorem 4.4.3 implies that $F \subset F(\alpha, \beta)$ is a finite extension and hence is
algebraic by Lemma 4.4.2. Thus every element of $F(\alpha, \beta)$ is algebraic over $F$. Since
$\alpha + \beta, \alpha\beta \in F(\alpha, \beta)$, the proposition is proved. ∎
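
The proof shows that $\alpha + \beta$ and $\alpha\beta$ are algebraic without producing polynomials that they satisfy. For a concrete instance, offered only as an illustration, take $F = \mathbb{Q}$, $\alpha = \sqrt{2}$ and $\beta = \sqrt{3}$, where such polynomials are easy to write down:

```latex
\begin{align*}
\alpha + \beta &= \sqrt{2} + \sqrt{3} \ \text{ is a root of } \ x^4 - 10x^2 + 1
   \quad \text{(Example 4.3.6)}, \\
\alpha\beta &= \sqrt{6} \ \text{ is a root of } \ x^2 - 6 .
\end{align*}
```

Both elements lie in $\mathbb{Q}(\sqrt{2}, \sqrt{3})$, which has degree 4 over $\mathbb{Q}$ by Example 4.3.9, and the degrees 4 and 2 of their minimal polynomials divide 4, as part (b) of Lemma 4.4.2 requires.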


Corollary 4.4.5 Given any field extension $F \subset L$, the subset

$$M = \{\alpha \in L \mid \alpha \text{ is algebraic over } F\}$$

is a subfield of $L$ containing $F$.


Proof: We have $F \subset M$ since $a \in F$ is a root of $x - a \in F[x]$, and $M$ is closed
under addition and multiplication by Proposition 4.4.4. Since $-1 \in F \subset M$, we see
that $\alpha \in M$ implies $-\alpha = -1 \cdot \alpha \in M$. Finally, if $\alpha \ne 0$ lies in $M$, then Exercise 1 of
Section 4.1 shows that $1/\alpha \in M$. It follows that $M$ is a subfield of $L$. ∎


Here is a classic example of this corollary. 


Example 4.4.6 A complex number $z \in \mathbb{C}$ is called an algebraic number if it is
algebraic over $\mathbb{Q}$. By Corollary 4.4.5, we have the field of algebraic numbers

$$\overline{\mathbb{Q}} = \{z \in \mathbb{C} \mid z \text{ is an algebraic number}\}.$$

Later in the section we will prove that $\overline{\mathbb{Q}}$ is algebraically closed. ◁▷

We next show that being algebraic is transitive in the following sense.


Theorem 4.4.7 Let $F \subset K \subset L$. If $\alpha \in L$ is algebraic over $K$ and $K$ is algebraic over
$F$, then $\alpha$ is algebraic over $F$.


Proof: Let $\alpha$ be a root of $f = \beta_n x^n + \cdots + \beta_0 \in K[x]$, where $\beta_n, \dots, \beta_0 \in K$ are not
all 0. By hypothesis, each $\beta_i$ is algebraic over $F$. Then $M = F(\beta_n, \dots, \beta_0)$ is a finite
extension of $F$ by Theorem 4.4.3. Furthermore, $M$ is constructed so that $f \in M[x]$. It
follows that $\alpha$ is algebraic over $M$, which implies that $M \subset M(\alpha)$ is a finite extension.
By Theorem 4.3.8,

$$[M(\alpha):F] = [M(\alpha):M]\,[M:F] < \infty.$$

Thus $F \subset M(\alpha)$ is finite and hence algebraic. This means that every element of
$M(\alpha)$, including $\alpha$, is algebraic over $F$. ∎


Here is an example of this theorem. 

Example 4.4.8 Theorem 4.4.7 implies that every complex solution of the equation

(4.14)  $$x^{11} - (\sqrt{2} + \sqrt{5})\,x^5 + 3\sqrt[4]{12}\,x^3 + (1 + 3i)\,x + \sqrt[5]{17} = 0$$

is an algebraic number. This follows because the coefficients are obviously algebraic
over $\mathbb{Q}$. (Do you see why there are no real solutions?) In Exercise 2 you will show
that the minimal polynomial of a solution of (4.14) has degree at most 1760. ◁▷
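
The number 1760 comes from multiplying degree bounds up a tower, as in Exercise 7 of Section 4.3. The sketch below is only an arithmetic check under stated assumptions: it reads (4.14) as having degree 11 with coefficients built from $\sqrt{2}$, $\sqrt{5}$, $\sqrt[4]{12}$, $i$ and $\sqrt[5]{17}$ (the field used in Exercise 2), and it uses the obvious upper bound for the degree contributed by each adjoined element. The dictionary keys are labels chosen for illustration.

```python
from math import prod

# Upper bounds for the degree of each successive extension in the tower
# Q ⊂ Q(sqrt2) ⊂ Q(sqrt2, sqrt5) ⊂ ... ⊂ L, one adjoined element at a time.
degree_bounds = {
    "sqrt(2)": 2,       # root of x^2 - 2
    "sqrt(5)": 2,       # root of x^2 - 5
    "12^(1/4)": 4,      # root of x^4 - 12
    "i": 2,             # root of x^2 + 1
    "17^(1/5)": 5,      # root of x^5 - 17
    "alpha": 11,        # root of (4.14), whose coefficients lie in the previous field
}

# By the Tower Theorem, [L : Q] is at most the product of these bounds.
bound = prod(degree_bounds.values())
print(bound)
```

Each bound holds because the degree of a minimal polynomial can only stay the same or drop when the base field is enlarged; Lemma 4.4.2 then bounds the degree of the minimal polynomial of $\alpha$ over $\mathbb{Q}$ by $[L:\mathbb{Q}]$.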


Theorem 4.4.7 also has the following immediate corollary. 


Corollary 4.4.9 If we have field extensions $F \subset K \subset L$ where $L$ is algebraic over $K$
and $K$ is algebraic over $F$, then $L$ is algebraic over $F$. ∎


Mathematical Notes 
Here are some of the ideas encountered in this section. 


■ The Field of Algebraic Numbers. In Example 4.4.6, we defined $\overline{\mathbb{Q}}$ to be the set
of all algebraic numbers in $\mathbb{C}$. This field has the following nice property.


Theorem 4.4.10 The field $\overline{\mathbb{Q}}$ of algebraic numbers is algebraically closed.


Proof: By Exercise 7 of Section 3.2 it suffices to show that every nonconstant
polynomial in $\overline{\mathbb{Q}}[x]$ has a root in $\overline{\mathbb{Q}}$. Given such a polynomial $f$, we can regard $f$
as an element of $\mathbb{C}[x]$, since $\overline{\mathbb{Q}} \subset \mathbb{C}$. Then $f$ has a root $\alpha \in \mathbb{C}$ by the Fundamental
Theorem of Algebra, and $\alpha$ is algebraic over $\overline{\mathbb{Q}}$ because $f \in \overline{\mathbb{Q}}[x]$. But $\overline{\mathbb{Q}}$ is algebraic
over $\mathbb{Q}$ by definition, so that $\alpha$ is algebraic over $\mathbb{Q}$ by Theorem 4.4.7. Thus $f$ has the
root $\alpha \in \overline{\mathbb{Q}}$, and we are done. ∎


One can also show that if $\mathbb{Q} \subset L$ is an extension such that $L$ is algebraic over $\mathbb{Q}$
and $L$ is algebraically closed, then $L \simeq \overline{\mathbb{Q}}$. More generally, if $F$ is any field, then
there is a field $\overline{F}$, unique up to isomorphism, such that $\overline{F}$ is algebraic over $F$ and
algebraically closed. We call $\overline{F}$ the algebraic closure of $F$. (Strictly speaking, $\overline{F}$
is only unique up to a nonunique isomorphism. Hence we should say "an algebraic
closure" rather than "the algebraic closure.") A discussion of algebraic closures can
be found in [Jacobson, Vol. II, Sec. 8.1].


■ Algebraic Integers. Finally, in addition to the notion of an algebraic number in $\mathbb{C}$,
one can also define an algebraic integer to be a complex number that is a root of a
monic polynomial with integer coefficients. For example, $\sqrt{2}$ and $\omega = (-1 + i\sqrt{3})/2$
are algebraic integers, since they are roots of $x^2 - 2$ and $x^2 + x + 1$, respectively; but
one can show that $\omega/2$ is not an algebraic integer (see Exercise 3). Algebraic integers
play an important role in number theory. For example, Euler proved Fermat's Last
Theorem for $n = 3$ by writing $x^3 + y^3 = z^3$ as

$$x^3 = z^3 - y^3 = (z - y)(z - \omega y)(z - \omega^2 y)$$

and using unique factorization in the ring of algebraic integers $\mathbb{Z}[\omega]$. This subject
is called algebraic number theory. For an introduction to algebraic number theory,
including the details of Euler's argument, see [10, Ch. 9].


Exercises for Section 4.4


Exercise 1. Lemma 4.4.2 shows that a finite extension is algebraic. Here we will give an
example to show that the converse is false. The field of algebraic numbers $\overline{\mathbb{Q}}$ is by definition
algebraic over $\mathbb{Q}$. You will show that $[\overline{\mathbb{Q}}:\mathbb{Q}] = \infty$ as follows.
(a) Given $n \ge 2$ in $\mathbb{Z}$, use Example 4.2.4 from Section 4.2 to show that $\overline{\mathbb{Q}}$ has a subfield $L$
such that $[L:\mathbb{Q}] = n$.
(b) Explain why part (a) implies that $[\overline{\mathbb{Q}}:\mathbb{Q}] = \infty$.


Exercise 2. Let $\alpha \in \mathbb{C}$ be a solution of (4.14). We will show that the minimal polynomial of
$\alpha$ over $\mathbb{Q}$ has degree at most 1760. Let $L = \mathbb{Q}(\sqrt{2}, \sqrt{5}, \sqrt[4]{12}, i, \sqrt[5]{17}, \alpha)$.

(a) Show that $[L:\mathbb{Q}] \le 1760$.

(b) Use Lemma 4.4.2 to show that the minimal polynomial of $\alpha$ has degree at most 1760.


Exercise 3. In the Mathematical Notes, we defined an algebraic integer to be a complex
number $\alpha \in \mathbb{C}$ that is a root of a monic polynomial in $\mathbb{Z}[x]$.
(a) Prove that $\alpha \in \mathbb{C}$ is an algebraic integer if and only if $\alpha$ is an algebraic number whose
minimal polynomial over $\mathbb{Q}$ has integer coefficients.
(b) Show that $\omega/2$ is not an algebraic integer, where $\omega = (-1 + i\sqrt{3})/2$.


Exercise 4. Use (4.10) and (4.11) to prove the following weak form of Lemma 4.4.2: if
$n = [L:F] < \infty$, then every $\alpha \in L$ is a root of a nonzero polynomial in $F[x]$ of degree $\le n$.


Exercise 5. In 1873 Hermite proved that the number $e$ is transcendental over $\mathbb{Q}$, and in 1882
Lindemann showed that $\pi$ is transcendental over $\mathbb{Q}$. It is unknown whether $\pi + e$ and $\pi - e$
are transcendental. Prove that at least one of these numbers is transcendental over $\mathbb{Q}$.


Exercise 6. Let $F$ be a field. Show that other than the elements of $F$ itself, no elements of
$F(x)$ are algebraic over $F$. Thus, even though $[F(x):F] = \infty$ by Example 4.3.7, the field
$M = \{\alpha \in F(x) \mid \alpha \text{ is algebraic over } F\}$ of Corollary 4.4.5 is as small as possible, namely $F$.


Exercise 7. Suppose that $F$ is an algebraically closed field, and let $F \subset L$ be an algebraic
extension. Prove that $F = L$.


Exercise 8. In this exercise you will show that every algebraic extension of $\mathbb{R}$ is finite of
degree at most 2. To prove this, consider an algebraic extension $\mathbb{R} \subset L$.
(a) Explain why we can find an extension $L \subset K$ such that $x^2 + 1$ has a root $\alpha \in K$.
(b) Prove that $L(\alpha)$ is algebraic over $\mathbb{R}(\alpha)$ and that $\mathbb{R}(\alpha) \simeq \mathbb{C}$.
(c) Now use the previous exercise to conclude that $[L:\mathbb{R}] \le 2$ and that equality occurs if and
only if $L \simeq \mathbb{C}$.


Exercise 9. Prove that $\alpha \in \mathbb{Q}$ is an algebraic integer if and only if $\alpha \in \mathbb{Z}$.


CHAPTER 5


NORMAL AND SEPARABLE 
EXTENSIONS 


This chapter will study some important properties of field extensions. We will begin 
with extensions obtained by adjoining all roots of a polynomial. These splitting fields 
will lead to the idea of normality. We will also consider the idea of separability for 
both polynomials and field extensions. The chapter will end with the Theorem of the 
Primitive Element. 


5.1 SPLITTING FIELDS 


Given a nonconstant polynomial $f \in F[x]$, Theorem 3.1.4 shows that there is an
extension $F \subset L$ over which $f$ splits completely. In this section we will consider the
smallest such extension.


A. Definition and Examples. We begin with a definition.

Definition 5.1.1 Let $f \in F[x]$ have degree $n > 0$. Then an extension $F \subset L$ is a
splitting field of $f$ over $F$ if

(a) $f = c(x - \alpha_1) \cdots (x - \alpha_n)$, where $c \in F$ and $\alpha_i \in L$, and

(b) $L = F(\alpha_1, \dots, \alpha_n)$.

Be sure you understand how this captures the idea of the smallest extension over
which a polynomial splits completely. The existence of splitting fields follows from
Theorem 3.1.4, for if $f \in F[x]$ splits completely as $f = c(x - \alpha_1) \cdots (x - \alpha_n)$ in $L[x]$,
then $F(\alpha_1, \dots, \alpha_n)$ is clearly a splitting field of $f$ over $F$. We will prove below that
all splitting fields of $f \in F[x]$ are isomorphic.

In the subsequent text, whenever we say "$L$ is a splitting field of $f \in F[x]$," we
will tacitly assume that $f$ is nonconstant.

Here are some examples of splitting fields. 


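Example 5.1.2 $\mathbb{Q}(\sqrt{2}, \sqrt{3}) = \mathbb{Q}(\pm\sqrt{2}, \pm\sqrt{3})$ is a splitting field of $(x^2 - 2)(x^2 - 3)$
over $\mathbb{Q}$. ◁▷


Example 5.1.3 In Example 4.1.10 we showed that

$$\mathbb{Q}(\sqrt[4]{2}, -\sqrt[4]{2}, i\sqrt[4]{2}, -i\sqrt[4]{2}) = \mathbb{Q}(i, \sqrt[4]{2}).$$

Thus $\mathbb{Q}(i, \sqrt[4]{2})$ is a splitting field of $x^4 - 2$ over $\mathbb{Q}$. ◁▷


Example 5.1.4 In Exercise 1 you will prove that the field $\mathbb{Q}(\omega, \sqrt[3]{2})$ considered in
Example 4.3.10 is a splitting field of $x^3 - 2$ over $\mathbb{Q}$. ◁▷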


Note that a splitting field of $f \in F[x]$ depends on both the polynomial $f$ and the
field $F$. For instance:

• a splitting field of $x^2 + 1$ over $\mathbb{Q}$ is $\mathbb{Q}(i)$;
• a splitting field of $x^2 + 1$ over $\mathbb{R}$ is $\mathbb{C}$;
• a splitting field of $x^2 + 1$ over $\mathbb{C}$ is $\mathbb{C}$.

Since the roots of a nonconstant polynomial $f \in F[x]$ are algebraic over $F$, it
follows from Theorem 4.4.3 that a splitting field of $f$ over $F$ is always a finite
extension of $F$. We can bound the degree of this extension as follows.


Theorem 5.1.5 Let $f \in F[x]$ be a polynomial of degree $n > 0$, and let $L$ be a splitting
field of $f$ over $F$. Then $[L:F] \le n!$.


Proof: We will prove this by induction on $n$. When $n = 1$, $f = ax + b$ has the root
$-b/a \in F$, since $a \ne 0$. Thus $L = F$ in this case, and $[L:F] \le 1!$ is clear.

Now suppose that $f$ has degree $n > 1$, and let $L = F(\alpha_1, \dots, \alpha_n)$ be a splitting
field of $f$ over $F$. If we write $f = (x - \alpha_1)g$, then the division algorithm implies that
$g \in F(\alpha_1)[x]$. Furthermore, the roots of $g$ are obviously $\alpha_2, \dots, \alpha_n$, so that a splitting
field of $g$ over $F(\alpha_1)$ is given by

$$F(\alpha_1)(\alpha_2, \dots, \alpha_n) = F(\alpha_1, \alpha_2, \dots, \alpha_n) = L,$$

where the first equality follows from Corollary 4.1.11. Since $g \in F(\alpha_1)[x]$ has degree
$n - 1$, our inductive hypothesis implies that

$$[L:F(\alpha_1)] \le (n-1)!.$$

To bound the degree of $F \subset L$, we use the extensions $F \subset F(\alpha_1) \subset L$. By the
Tower Theorem (Theorem 4.3.8), we have

$$[L:F] = [L:F(\alpha_1)]\,[F(\alpha_1):F] \le (n-1)!\,[F(\alpha_1):F].$$

However, we also know that $[F(\alpha_1):F]$ is the degree of the minimal polynomial of
$\alpha_1$ over $F$, by Proposition 4.3.4. Since $f(\alpha_1) = 0$, we obtain $[F(\alpha_1):F] \le n$, and
then $[L:F] \le n!$ follows. ∎


Sometimes the bound in Theorem 5.1.5 is sharp, meaning that there are cases
where equality occurs, though the inequality can also be strict. For instance:

• By Example 5.1.2, $\mathbb{Q}(\sqrt{2}, \sqrt{3})$ is a splitting field of $(x^2 - 2)(x^2 - 3)$ over $\mathbb{Q}$ and
has degree $4 < 4!$ over $\mathbb{Q}$.

• By Example 5.1.4, $\mathbb{Q}(\omega, \sqrt[3]{2})$ is a splitting field of $x^3 - 2$ over $\mathbb{Q}$ and has degree
$6 = 3!$ over $\mathbb{Q}$.

We will see in the next chapter that the size of the splitting field is closely related to
the size of the Galois group of the extension.


B. Uniqueness. We next study the uniqueness of splitting fields. A given
polynomial $f \in F[x]$ will have many distinct splitting fields. For example, $\mathbb{Q}(\sqrt{2})$
and $\mathbb{Q}[t]/(t^2 - 2)$ are splitting fields of $x^2 - 2$ over $\mathbb{Q}$. The key point is that while
they are not the same, they are isomorphic.

In order to prove this result for all polynomials, we need to prove something
more general. Suppose that we have an isomorphism of fields $\varphi : F_1 \simeq F_2$, and let
$f_1 \in F_1[x]$ be a polynomial of degree $n > 0$. Applying $\varphi$ to the coefficients of $f_1$ gives
a polynomial $f_2 \in F_2[x]$.

Now let $L_i$ be a splitting field of $f_i$ over $F_i$ for $i = 1, 2$. This gives the picture

$$\begin{array}{ccc} L_1 & & L_2 \\ \cup & & \cup \\ F_1 & \xrightarrow{\ \varphi\ } & F_2. \end{array}$$

Although the splitting fields $L_1$ and $L_2$ may be constructed in quite different ways,
the following theorem tells us that they are always isomorphic.


Theorem 5.1.6 Given $f_1 \in F_1[x]$ and $\varphi : F_1 \simeq F_2$ as above, there is an isomorphism
$\bar\varphi : L_1 \simeq L_2$ such that $\varphi = \bar\varphi|_{F_1}$.


Proof: We will prove this by induction on $n = \deg(f_1) = \deg(f_2)$. When $n = 1$, we
saw in the proof of Theorem 5.1.5 that $L_1 = F_1$ and $L_2 = F_2$. The theorem follows in
this case by taking $\bar\varphi = \varphi$.

Now suppose that $n > 1$. We know that $L_1 = F_1(\alpha_1, \dots, \alpha_n)$, where $\alpha_1, \dots, \alpha_n$ are
the roots of $f_1$. As in the proof of Theorem 5.1.5, we will use the extensions

(5.1)  $$F_1 \subset F_1(\alpha_1) \subset L_1,$$

where $F_1(\alpha_1) \subset L_1$ is a splitting field of $g_1 = f_1/(x - \alpha_1)$. We now proceed in the
following five steps.

Step 1. We first create an abstract model for $F_1(\alpha_1)$. Let $h_1 \in F_1[x]$ be the minimal
polynomial of $\alpha_1$. We know that $h_1$ is an irreducible factor of $f_1 \in F_1[x]$, since $\alpha_1$ is
a root of $f_1$. Thus

$$F_1(\alpha_1) = F_1[\alpha_1] \simeq F_1[x]/(h_1),$$

where we have used Proposition 4.1.14 (for the equality) and Lemma 4.1.13 (for the
isomorphism). The resulting isomorphism takes $\alpha_1$ to $x + (h_1)$.


Step 2. We next find a root of $f_2$ corresponding to $\alpha_1$. The key point is that the field
isomorphism $\varphi : F_1 \simeq F_2$ induces a ring isomorphism $\tilde\varphi : F_1[x] \simeq F_2[x]$ that takes $f_1$
to $f_2$. This isomorphism takes factors to factors and irreducibles to irreducibles. In
particular, $h_1$ will map to an irreducible factor $h_2$ of $f_2$. Since $f_2$ splits completely
over $L_2$, so does $h_2$ (do you see why?). This allows us to label the roots of $f_2$ as
$\beta_1, \dots, \beta_n \in L_2$, where $\beta_1$ is a root of $h_2$.


Step 3. The root $\beta_1$ of $f_2$ gives the extensions

(5.2)  $$F_2 \subset F_2(\beta_1) \subset L_2,$$

where $F_2(\beta_1) \subset L_2$ is a splitting field of $g_2 = f_2/(x - \beta_1)$. As in Step 1, we also have

$$F_2(\beta_1) = F_2[\beta_1] \simeq F_2[x]/(h_2),$$

since $h_2$ is the minimal polynomial of $\beta_1$. This isomorphism takes $\beta_1$ to $x + (h_2)$.


Step 4. Since $\tilde\varphi : F_1[x] \simeq F_2[x]$ takes $h_1$ to $h_2$, it must take $(h_1)$ to $(h_2)$. This means
that we get an isomorphism of quotient rings

$$F_1[x]/(h_1) \simeq F_2[x]/(h_2)$$

that takes $x + (h_1)$ to $x + (h_2)$ and is $\varphi$ on the coefficients. Combining this with
Steps 1 and 3, we get an isomorphism

$$\varphi_1 : F_1(\alpha_1) \simeq F_1[x]/(h_1) \simeq F_2[x]/(h_2) \simeq F_2(\beta_1)$$

that takes $\alpha_1$ to $\beta_1$ and satisfies $\varphi_1|_{F_1} = \varphi$.


Step 5. Finally, since $\varphi_1 : F_1(\alpha_1) \simeq F_2(\beta_1)$ takes $\alpha_1$ to $\beta_1$ and $f_1$ to $f_2$, it also takes
$g_1 = f_1/(x - \alpha_1)$ to $g_2 = f_2/(x - \beta_1)$. As noted above, $L_1$ is a splitting field of $g_1$
over $F_1(\alpha_1)$, and in the same way, $L_2$ is a splitting field of $g_2$ over $F_2(\beta_1)$.


We can now prove the existence of the desired isomorphism between $L_1$ and $L_2$.
If we combine the extensions (5.1) and (5.2) together with the isomorphisms $\varphi$ and
$\varphi_1$, then we get the diagram

(5.3)  $$\begin{array}{ccc} L_1 & & L_2 \\ \cup & & \cup \\ F_1(\alpha_1) & \xrightarrow{\ \varphi_1\ } & F_2(\beta_1) \\ \cup & & \cup \\ F_1 & \xrightarrow{\ \varphi\ } & F_2. \end{array}$$

Since $g_1 = f_1/(x - \alpha_1)$ has degree $n - 1$, Step 5 implies that we can apply the inductive
hypothesis to $g_1 \in F_1(\alpha_1)[x]$ and $\varphi_1 : F_1(\alpha_1) \simeq F_2(\beta_1)$. This gives $\bar\varphi_1 : L_1 \simeq L_2$, whose
restriction to $F_1(\alpha_1)$ is $\varphi_1$. But since $\varphi_1|_{F_1} = \varphi$, it follows that the restriction of $\bar\varphi_1$
to $F_1$ is $\varphi$. Thus $\bar\varphi_1$ is the desired isomorphism. ∎


When applied to the identity map $1_F : F \to F$ and $f \in F[x]$, Theorem 5.1.6 implies
the following uniqueness result for splitting fields.


Corollary 5.1.7 If $L_1$ and $L_2$ are splitting fields of $f \in F[x]$, then there is an isomorphism
$L_1 \simeq L_2$ that is the identity on $F$. ∎


Because of this corollary, we can now speak of the splitting field of $f \in F[x]$,
provided that we remember that splitting fields are unique up to isomorphism.

One might wonder why we proved Theorem 5.1.6 if all we wanted was Corollary 5.1.7.
The answer lies in the inductive nature of the proof: if we begin with
the identity map $\varphi = 1_F : F \to F$, then the inductive step (5.3) uses the isomorphism
$\varphi_1 : F(\alpha_1) \simeq F(\beta_1)$. So if we had stated Theorem 5.1.6 only for the identity, then
our inductive hypothesis would not apply, since $\varphi_1$ need not be the identity.

We conclude this section with a further application of Theorem 5.1.6. The idea 
is that this theorem gives some interesting isomorphisms of a splitting field. More 
precisely, the following result will play an important role in Chapter 6. 


Proposition 5.1.8 Let $L$ be a splitting field of a polynomial in $F[x]$, and suppose that
$h \in F[x]$ is irreducible and has roots $\alpha, \beta \in L$. Then there is a field isomorphism
$\sigma : L \simeq L$ that is the identity on $F$ and takes $\alpha$ to $\beta$.


Proof: Since $h$ is the minimal polynomial of $\alpha$, we have an isomorphism

$$F(\alpha) = F[\alpha] \simeq F[x]/(h)$$

that is the identity on $F$ and sends $\alpha$ to $x + (h)$. Similarly, using $\beta$, we have

$$F(\beta) = F[\beta] \simeq F[x]/(h)$$

that is the identity on $F$ and sends $\beta$ to $x + (h)$. As in Step 4 of the proof of
Theorem 5.1.6, we can put these together to get a field isomorphism

$$\varphi : F(\alpha) \simeq F(\beta)$$

such that $\varphi(\alpha) = \beta$ and $\varphi$ is the identity on $F$.

Now suppose that $L$ is a splitting field of $f \in F[x]$. Then $f \in F(\alpha)[x]$ and $f \in
F(\beta)[x]$, which means that $L$ is a splitting field of $f$ over both $F(\alpha)$ and $F(\beta)$. Thus
we have the following diagram of splitting fields

$$\begin{array}{ccc} L & & L \\ \cup & & \cup \\ F(\alpha) & \xrightarrow{\ \varphi\ } & F(\beta), \end{array}$$

where $\varphi$ takes $f$ to $f$. Then Theorem 5.1.6 gives $\bar\varphi : L \simeq L$ such that $\bar\varphi|_{F(\alpha)} = \varphi$.
Since $\varphi$ is the identity on $F$ and maps $\alpha$ to $\beta$, $\sigma = \bar\varphi$ is what we want. ∎


Here is an example of this proposition. 


Example 5.1.9 $L = \mathbb{Q}(\sqrt{2})$ is the splitting field of $x^2 - 2$ over $\mathbb{Q}$. This polynomial
is irreducible over $\mathbb{Q}$ and has roots $\pm\sqrt{2} \in L$. Then Proposition 5.1.8 implies that
there is an isomorphism $\sigma : L \simeq L$ such that $\sigma(\sqrt{2}) = -\sqrt{2}$. ◁▷
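
In this example the isomorphism can be written down explicitly: by (4.12) every element of $L$ is $a + b\sqrt{2}$ with $a, b \in \mathbb{Q}$, and $\sigma$ must send it to $a - b\sqrt{2}$. The sketch below is an illustration, not part of the text: it stores $a + b\sqrt{2}$ as the pair `(a, b)`, with hypothetical helper names, and spot-checks on sample elements that this map respects addition and multiplication and fixes $\mathbb{Q}$.

```python
from fractions import Fraction

# An element a + b*sqrt2 of Q(sqrt2) is stored as the pair (a, b).

def add(u, v):
    return (u[0] + v[0], u[1] + v[1])

def mul(u, v):
    """(a + b*sqrt2)(c + d*sqrt2) = (ac + 2bd) + (ad + bc)*sqrt2."""
    a, b = u
    c, d = v
    return (a * c + 2 * b * d, a * d + b * c)

def sigma(u):
    """The map a + b*sqrt2 -> a - b*sqrt2, which sends sqrt2 to -sqrt2."""
    a, b = u
    return (a, -b)

u = (Fraction(1, 2), Fraction(3))        # 1/2 + 3*sqrt2
v = (Fraction(-2), Fraction(5, 7))       # -2 + (5/7)*sqrt2

preserves_sum = sigma(add(u, v)) == add(sigma(u), sigma(v))
preserves_product = sigma(mul(u, v)) == mul(sigma(u), sigma(v))
fixes_rationals = sigma((Fraction(4, 9), 0)) == (Fraction(4, 9), 0)
```

A check on two sample elements is not a proof, but the same identities hold for all pairs, since both sides are given by the same formulas in $a, b, c, d$.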


In the terminology of Chapter 6, an isomorphism $\sigma : L \simeq L$ that is the identity on
$F \subset L$ is an element of the Galois group $\mathrm{Gal}(L/F)$. We will use Proposition 5.1.8 to
construct elements of $\mathrm{Gal}(L/F)$ when $L$ is a splitting field over $F$.


Exercises for Section 5.1 


Exercise 1. Show that a splitting field of $x^3 - 2$ over $\mathbb{Q}$ is $\mathbb{Q}(\omega, \sqrt[3]{2})$, $\omega = e^{2\pi i/3}$.


Exercise 2. Prove that $f \in F[x]$ splits completely over $F$ if and only if $F$ is the splitting field
of $f$ over $F$.


Exercise 3. Prove that an extension $F \subset L$ of degree 2 is a splitting field.
Exercise 4. Find the splitting field of x° — 1 € Q[x]. 


Exercise 5. We showed in Section 4.1 that $f = x^4 - 10x^2 + 1$ is irreducible over $\mathbb{Q}$. Show that
$L = \mathbb{Q}(\sqrt{2} + \sqrt{3})$ is the splitting field of $f$ over $\mathbb{Q}$.


Exercise 6. Let $f \in \mathbb{Q}[x]$ be the minimal polynomial of $\alpha = \sqrt{2 + \sqrt{2}}$.
(a) Show that $f = x^4 - 4x^2 + 2$. Thus $[\mathbb{Q}(\alpha):\mathbb{Q}] = 4$.
(b) Show that $\mathbb{Q}(\alpha)$ is the splitting field of $f$ over $\mathbb{Q}$.


Exercise 7. Let $f = x^3 - x + 1 \in \mathbb{F}_3[x]$.

(a) Show that $f$ is irreducible over $\mathbb{F}_3$.

(b) Let $L$ be the splitting field of $f$ over $\mathbb{F}_3$. Prove that $[L:\mathbb{F}_3] = 3$.
(c) Explain why $L$ is a field with 27 elements.


Exercise 8. Let $n$ be a positive integer. Then the polynomial $f = x^n - 2$ is irreducible over $\mathbb{Q}$
by the Schönemann–Eisenstein criterion for the prime 2.

(a) Determine the splitting field $L$ of $f$ over $\mathbb{Q}$.

(b) Show that $[L:\mathbb{Q}] = n(n - 1)$ when $n$ is prime.


Exercise 9. Let $f \in F[x]$ have degree $n > 0$, and let $L$ be the splitting field of $f$ over $F$.

(a) Suppose that $[L:F] = n!$. Prove that $f$ is irreducible over $F$.
(b) Show that the converse of part (a) is false.


Exercise 10. Let $F \subset L$ be the splitting field of $f \in F[x]$, and let $K$ be a field such that
$F \subset K \subset L$. Prove that $K \subset L$ is the splitting field of some polynomial in $K[x]$.


Exercise 11. Suppose that $f \in F[x]$ is irreducible of degree $n > 0$, and let $L$ be the splitting
field of $f$ over $F$.

(a) Prove that $n \mid [L:F]$.

(b) Give an example to show that $n = [L:F]$ can occur in part (a).


Exercise 12. In the situation of Theorem 5.1.6, explain why $[L_1:F_1] = [L_2:F_2]$.


Exercise 13. Let $L = \mathbb{Q}(\sqrt{2}, \sqrt{3})$. Use Proposition 5.1.8 to prove that there is an isomorphism
$\sigma : L \simeq L$ such that $\sigma(\sqrt{2}) = \sqrt{2}$ and $\sigma(\sqrt{3}) = -\sqrt{3}$.


5.2 NORMAL EXTENSIONS


In this section, we will discover an important property of splitting fields. This will 
lead to the concept of a normal extension. 

Being a splitting field is a very special property of a field extension. For example,
we will see below that $\mathbb{Q}(\sqrt[3]{2})$ is not the splitting field of any $f \in \mathbb{Q}[x]$. The basic
reason for this lies in the following proposition.


Proposition 5.2.1 Let $L$ be the splitting field of $f \in F[x]$, and let $g \in F[x]$ be irreducible.
If $g$ has one root in $L$, then $g$ splits completely over $L$.


Proof: We can assume that $f$ and $g$ are monic. Then $L = F(\alpha_1, \dots, \alpha_n)$, where
$f = (x - \alpha_1) \cdots (x - \alpha_n)$. If $\beta \in L$ is a root of $g$, then $g$ is the minimal polynomial of
$\beta$ since $g$ is irreducible and monic. We need to prove that all roots of $g$ lie in $L$.


Proposition 4.1.15 implies that L = F[a1,...,d,|, so that 8 is a polynomial in the 
ay, ie., 8B = h(ay,...,Q,) for some h € F[x;,...,X,|. Now consider the polynomial 
(5.4) s(x) = Il (x —A(ag(1),--+5Q@o(n))) € Li]. 

a€ES, 


This clearly has all of its roots in L. Furthermore, the factor corresponding to o = e 
is x —h(ay,...,@,) =x— G, so that @ is a root of s. 

If we could show that $s \in F[x]$, then $g \mid s$ would follow immediately, since $g$ is the
minimal polynomial of $\beta$. Since $s$ splits completely over $L$, this would imply that $g$
also splits completely over $L$.

Hence it suffices to prove that $s \in F[x]$. We do this by going to the universal
situation, as we did for the polynomial $g_\lambda$ in the proof of Theorem 3.2.4. The
polynomial

$$S(x) = \prod_{\sigma \in S_n} \bigl(x - h(x_{\sigma(1)}, \dots, x_{\sigma(n)})\bigr)$$

has coefficients in $F[x_1, \dots, x_n]$. Furthermore, permuting $x_1, \dots, x_n$ permutes the
factors of $S$. It follows that if we multiply out $S$, then we get an expression

$$S(x) = \sum_{i=0}^{n!} p_i(x_1, \dots, x_n)\, x^i,$$

where each $p_i(x_1, \dots, x_n) \in F[x_1, \dots, x_n]$ is symmetric. Since the $\alpha_i$ are the roots of
$f \in F[x]$, Corollary 2.2.5 implies that $p_i(\alpha_1, \dots, \alpha_n) \in F$. We conclude that

$$s(x) = \sum_{i=0}^{n!} p_i(\alpha_1, \dots, \alpha_n)\, x^i \in F[x].$$

As explained above, the proposition now follows. ∎


This proof of Proposition 5.2.1 uses the theory of symmetric polynomials from
Chapter 2. See [Stewart, Ch. 10] for a proof that doesn’t use symmetric polynomials.

Here is the example promised above.


Example 5.2.2 It is now easy to see why $\mathbb{Q}(\sqrt[3]{2})$ is not the splitting field of any
polynomial in $\mathbb{Q}[x]$, since $x^3 - 2$ is irreducible over $\mathbb{Q}$ and obviously has a root in
$\mathbb{Q}(\sqrt[3]{2})$. If this field were a splitting field, then Proposition 5.2.1 would force $x^3 - 2$
to split completely over $\mathbb{Q}(\sqrt[3]{2})$. But this is impossible, since $\mathbb{Q}(\sqrt[3]{2}) \subset \mathbb{R}$ doesn’t
contain the complex roots $\omega\sqrt[3]{2}, \omega^2\sqrt[3]{2}$ of $x^3 - 2$.
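
As a quick numerical illustration of this example (a sketch only: it uses floating-point arithmetic and Python's standard `cmath` module, neither of which appears in the text, and the tolerance `1e-9` is an arbitrary choice), one can list the three complex roots of $x^3 - 2$ and observe that only one of them is real:

```python
import cmath

# Real cube root of 2 and the primitive cube root of unity omega = e^{2 pi i / 3}.
cube_root = 2 ** (1 / 3)
omega = cmath.exp(2j * cmath.pi / 3)

# The three roots of x^3 - 2 are cube_root * omega^k for k = 0, 1, 2.
roots = [cube_root * omega ** k for k in range(3)]

# Each candidate really is a root, up to rounding error.
for r in roots:
    assert abs(r ** 3 - 2) < 1e-9

# Keep the roots whose imaginary part is numerically zero.
real_roots = [r for r in roots if abs(r.imag) < 1e-9]
print(len(real_roots), "of", len(roots), "roots are real")
```

A numerical check like this is no substitute for the argument above; it only makes visible that two of the three roots lie outside $\mathbb{R}$.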


In Exercise 1 you will prove similarly that $\mathbb{Q}(\sqrt[4]{2})$ is not the splitting field of any
polynomial in $\mathbb{Q}[x]$.

The property of Proposition 5.2.1 leads to the following definition.


Definition 5.2.3 An algebraic extension $F \subset L$ is normal if every irreducible polynomial
in $F[x]$ that has a root in $L$ splits completely over $L$.


In Exercise 2 you will show that $F \subset L$ is normal if and only if the minimal
polynomial (relative to $F$) of every $\alpha \in L$ splits completely over $L$.

The following result reveals the strong link between normal extensions and split- 
ting fields. 


Theorem 5.2.4 Suppose that $F \subset L$. Then $L$ is the splitting field of some $f \in F[x]$ if
and only if the extension $F \subset L$ is normal and finite.


Proof: First suppose that $L$ is the splitting field of $f \in F[x]$. Then $F \subset L$ is finite by
Theorem 5.1.5 and is normal by Proposition 5.2.1.

For the converse, suppose that $F \subset L$ is normal and finite. By Theorem 4.4.3, the
finiteness of this extension implies $L = F(\alpha_1, \dots, \alpha_m)$, where each $\alpha_i$ is algebraic
over $F$. Let $p_i \in F[x]$ be the minimal polynomial of $\alpha_i$, and set $f = p_1 \cdots p_m$. We
will show that $L$ is the splitting field of $f$ over $F$.

To prove this, first observe that every $p_i$ splits completely over $L$, since $F \subset L$
is normal and $p_i \in F[x]$ is irreducible with a root $\alpha_i \in L$. It follows that $f$ splits
completely over $L$. Now let $L' \subset L$ be the subfield of $L$ generated by $F$ and the roots
of $f$. Since the roots of $f$ include $\alpha_1, \dots, \alpha_m$, we have

$$L = F(\alpha_1, \dots, \alpha_m) \subset L' \subset L.$$

This shows that $L' = L$, so that $L$ is the splitting field of $f$ over $F$. ∎


We will see that normal extensions play an important role in Galois theory.


Historical Notes

Polynomials similar to $s(x)$ in (5.4) appear in the work of Galois. For example, in
his first memoir on Galois theory, Galois says the following:

    In fact, by multiplying together all of the factors of the form $V - \varphi(a, b, c, \dots, d)$,
    where one operates on the letters by all possible permutations, one will get an
    equation rational in $V$ that is necessarily divisible by the equation in question.

(See [Galois, p. 51].) Here, $a, b, c, \dots, d$ are roots of a polynomial $f \in F[x]$, and
$\varphi(a, b, c, \dots, d)$ is an element of the splitting field $F(a, b, c, \dots, d)$. Then we can
interpret Galois’s statement as follows:

• By saying “an equation rational in $V$,” Galois is asserting that the resulting polynomial
  in $V$ has coefficients in $F$. This is exactly what we proved about the
  polynomial $s(x)$ in (5.4).

• When Galois says that this is “necessarily divisible by the equation in question,”
  he is referring to the minimal polynomial of $\varphi(a, b, c, \dots, d)$. This is what we
  called $g$ in Proposition 5.2.1.

Thus, although normality does not appear explicitly in Galois’s work, the above
quotation should make it clear that it is implicit in what he does. We will say more
about Galois’s results in Chapter 12.


Exercises for Section 5.2 


Exercise 1. Prove that $\mathbb{Q}(\sqrt[4]{2})$ is not the splitting field of any polynomial in $\mathbb{Q}[x]$.


Exercise 2. Prove that an algebraic extension $F \subset L$ is normal if and only if for every $\alpha \in L$,
the minimal polynomial of $\alpha$ over $F$ splits completely over $L$.


Exercise 3. Determine whether the following extensions are normal. Justify your answers.
(a) $\mathbb{Q} \subset \mathbb{Q}(\zeta_n)$, where $\zeta_n = e^{2\pi i/n}$.


(c) $F = \mathbb{F}_3(t) \subset F(\alpha)$, where $t$ is a variable and $\alpha$ is a root of $x^3 - t$ in a splitting field.


Exercise 4. Give an example of a normal extension of $\mathbb{Q}$ that is not finite.


5.3  SEPARABLE EXTENSIONS 


Given a nonconstant polynomial $f \in F[x]$ with splitting field $F \subset L$, we can write

$$f = a_0 (x - \alpha_1) \cdots (x - \alpha_n), \quad a_0 \in F, \ \alpha_1, \dots, \alpha_n \in L. \tag{5.5}$$

It is important to realize that $\alpha_1, \dots, \alpha_n$ are not always distinct. For example,
$f = x^2 - 2x + 1 \in \mathbb{Q}[x]$ has $\alpha_1 = \alpha_2 = 1$. In this section, we will study those special
polynomials for which the roots are all different.

We begin with some terminology. Given $f$ as in (5.5), let $\beta_1, \dots, \beta_r$ be the distinct
elements of $L$ that appear among $\alpha_1, \dots, \alpha_n$, and let $m_i$ be the number of times $x - \beta_i$
appears in (5.5). Then we can write (5.5) as

$$f = a_0 (x - \beta_1)^{m_1} \cdots (x - \beta_r)^{m_r}, \quad a_0 \in F, \ \beta_1, \dots, \beta_r \in L \text{ distinct}, \ m_1, \dots, m_r \ge 1.$$

We call $m_i$ the multiplicity of $\beta_i$ and say that $\beta_i$ is a simple root if $m_i = 1$ and a multiple
root if $m_i > 1$.


Definition 5.3.1 A polynomial $f \in F[x]$ is separable if it is nonconstant and its roots
in a splitting field are all simple.


In other words, $f$ is separable if it has distinct roots. These definitions are
independent of the splitting field used, since all splitting fields of $f$ over $F$ are isomorphic.

One tool used to study separability is the discriminant $\Delta(f) \in F$ of a monic
polynomial $f \in F[x]$. We defined $\Delta(f)$ in Section 2.4 and showed in Proposition 2.4.3
that if $\deg(f) > 1$, then

$$\Delta(f) = \prod_{1 \le i < j \le n} (\alpha_i - \alpha_j)^2 \quad \text{when } f = (x - \alpha_1) \cdots (x - \alpha_n).$$


Another tool we will need is the formal derivative, which for a polynomial
$g = a_0 x^n + a_1 x^{n-1} + \cdots + a_{n-1} x + a_n \in F[x]$ is defined to be

$$g' = n a_0 x^{n-1} + (n-1) a_1 x^{n-2} + \cdots + a_{n-1}.$$

The operation $g \mapsto g'$ enjoys the usual properties from calculus, including

$$(ag + bh)' = ag' + bh', \qquad (gh)' = g'h + gh' \tag{5.6}$$

for $g, h \in F[x]$ and $a, b \in F$. See Exercise 1 for a proof of (5.6).

Separability, the discriminant, and the formal derivative are related as follows.


Proposition 5.3.2 If $f \in F[x]$ is monic and nonconstant, then the following are
equivalent:

(a) $f$ is separable.
(b) $\Delta(f) \ne 0$.
(c) $f$ and $f'$ are relatively prime in $F[x]$, i.e., $\gcd(f, f') = 1$.


Proof: If $\deg(f) = 1$, then $\Delta(f) = 1$ by the definition of $\Delta(f)$ given in Section 2.4.
It follows easily that (a), (b), and (c) are all true in this case. Hence we may assume
that $\deg(f) = n > 1$.

For (a) $\Leftrightarrow$ (b), let $\alpha_1, \dots, \alpha_n$ be the roots of $f$ in some splitting field. The above
formula for $\Delta(f)$ shows that $\Delta(f) \ne 0$ is equivalent to $\alpha_i \ne \alpha_j$ for all $i < j$.

It remains to show (a) $\Leftrightarrow$ (c). Let $L$ be a splitting field of $f$ over $F$, so that
$f = (x - \alpha_1) \cdots (x - \alpha_n)$ in $L[x]$. For a given $i$, write

$$f(x) = (x - \alpha_i) h_i(x), \quad h_i(x) = \prod_{j \ne i} (x - \alpha_j).$$

Differentiating, we obtain $f'(x) = (x - \alpha_i) h_i'(x) + h_i(x)$ by the product rule, and then
evaluating at $\alpha_i$ gives

$$f'(\alpha_i) = h_i(\alpha_i) = \prod_{j \ne i} (\alpha_i - \alpha_j). \tag{5.7}$$

If (c) is false, then $f$ and $f'$ have a common factor $g$ of positive degree. Since $g \mid f$,
we must have $g(\alpha_i) = 0$ for some $i$, and then $g \mid f'$ implies that $f'(\alpha_i) = 0$. Hence
$0 = f'(\alpha_i) = \prod_{j \ne i} (\alpha_i - \alpha_j)$, so that $\alpha_i = \alpha_j$ for some $j \ne i$.

Conversely, if (c) is true, then $1 = Af + Bf'$ for some $A, B \in F[x]$. Evaluating
this at $\alpha_i$ gives $1 = B(\alpha_i) f'(\alpha_i)$, so that $f'(\alpha_i) \ne 0$. By (5.7), this implies that
$\prod_{j \ne i} (\alpha_i - \alpha_j)$ is nonzero for all $i$. Hence $\alpha_1, \dots, \alpha_n$ are distinct. ∎
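
The key identity in this proof, $f'(\alpha_i) = \prod_{j \ne i} (\alpha_i - \alpha_j)$, can be checked on small integer examples. The following is a minimal sketch, not part of the text: the helper names (`poly_from_roots`, `derivative`, `evaluate`, `check_5_7`) are hypothetical, and polynomials are stored as coefficient lists with the lowest degree first.

```python
from math import prod  # Python 3.8+

def poly_from_roots(roots):
    """Coefficients (lowest degree first) of the monic polynomial with the given roots."""
    coeffs = [1]
    for r in roots:
        shifted = [0] + coeffs                    # x * (current polynomial)
        scaled = [r * c for c in coeffs] + [0]    # r * (current polynomial), padded
        coeffs = [s - t for s, t in zip(shifted, scaled)]
    return coeffs

def derivative(coeffs):
    """Formal derivative of a coefficient list."""
    return [i * c for i, c in enumerate(coeffs)][1:]

def evaluate(coeffs, x):
    return sum(c * x ** i for i, c in enumerate(coeffs))

def check_5_7(roots):
    """Compare f'(alpha_i) with the product of (alpha_i - alpha_j) over j != i."""
    df = derivative(poly_from_roots(roots))
    return all(
        evaluate(df, a) == prod(a - b for j, b in enumerate(roots) if j != i)
        for i, a in enumerate(roots)
    )

print(check_5_7([1, 2, 4]))   # distinct roots
print(check_5_7([1, 1, 3]))   # repeated root: both sides vanish at the root 1
```

When a root is repeated, both sides of the identity are zero at that root, which is exactly why $f$ and $f'$ then share a common factor.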

The definition of separable polynomial given in Definition 5.3.1 is nonstandard in
that it applies to arbitrary nonconstant polynomials with distinct roots, while most
books focus on irreducible polynomials with distinct roots. Fortunately, as long as we
restrict to irreducible polynomials, Definition 5.3.1 is consistent with the literature.

We can also extend the concept of separability to algebraic extensions.


Definition 5.3.3 Let $F \subset L$ be an algebraic extension.
(a) $\alpha \in L$ is separable over $F$ if its minimal polynomial over $F$ is separable.
(b) $F \subset L$ is a separable extension if every $\alpha \in L$ is separable over $F$.


Since minimal polynomials are irreducible, this agrees with the definition of
separable extension given in other texts.

We can interpret the separability of a polynomial in terms of its irreducible factors 
as follows. 


Lemma 5.3.4 A nonconstant polynomial $f \in F[x]$ is separable if and only if $f$ is a
product of irreducible polynomials, each of which is separable and no two of which
are multiples of each other.


Proof: First assume that $f$ is separable. If a factor of $f$ fails to have distinct roots
in a splitting field, then the same is true for $f$. Hence any irreducible factor of $f$ must
be separable. Also, if the factorization of $f$ into irreducibles includes two factors that
are multiples of each other, then the product of these factors would be a nonseparable
divisor of $f$. Hence the factorization of $f$ must consist of separable, irreducible
polynomials no two of which are multiples of each other.

Conversely, let $f = g_1 \cdots g_r$, where $g_1, \dots, g_r$ are separable and irreducible, and no
two are multiples of each other. Then, in the splitting field of $f$, each $g_i$ has distinct
roots. Furthermore, suppose that $g_i$ and $g_j$ share a root $\alpha$ for some $i \ne j$. Since $g_i$
and $g_j$ are irreducible, this would imply that each was a constant times the minimal
polynomial of $\alpha$, which is a contradiction. Hence $f$ is separable. ∎


In order to make good use of Lemma 5.3.4, we need to understand when an 
irreducible polynomial is separable. Fortunately, many irreducible polynomials are 
automatically separable. 


Lemma 5.3.5 Let $f \in F[x]$ be an irreducible polynomial of degree $n$. Then $f$ is
separable if either of the following conditions is satisfied:
(a) $F$ has characteristic 0, or
(b) $F$ has characteristic $p > 0$ and $p \nmid n$.


Proof: Let $f = a_0 x^n + \cdots + a_{n-1} x + a_n$, where $n > 0$ and $a_0 \ne 0$. Then
$f' = n a_0 x^{n-1} + \cdots + a_{n-1}$. Condition (a) or (b) implies that $n \ne 0$ in $F$, so that $a_0 \ne 0$
implies $n a_0 \ne 0$. Hence $f'$ is nonzero and has degree $n - 1$.

Since $f$ is irreducible, its only divisors (up to constant multiples) are 1 and $f$.
Hence $g = \gcd(f, f')$ must be 1 or $f$, up to constants. But $g \mid f'$ and $f' \ne 0$ imply
$\deg(g) \le \deg(f') = n - 1$. Thus $g$ cannot be a multiple of $f$, so $\gcd(f, f') = g = 1$. ∎

One surprise of Lemma 5.3.5 is that separability is related to the characteristic.
Here is another example of this phenomenon.


Example 5.3.6 Consider $f = x^n - 1 \in F[x]$, where $n > 0$. By Proposition 5.3.2, $f$ is
separable if and only if $f$ is relatively prime to $f' = n x^{n-1}$. However:

• If $n \ne 0$ in $F$, then the only irreducible factor of $f'$ is $x$, which clearly doesn’t
  divide $f$. Thus $f$ is relatively prime to $f'$ in this case.

• If $n = 0$ in $F$, then $f'$ is identically zero, in which case $f$ divides $f'$. Hence $f$ is
  not relatively prime to $f'$ in this case.

It follows that $x^n - 1 \in F[x]$ fails to be separable if and only if $F$ has characteristic $p$
and $p$ divides $n$.


For the remainder of the section, we will consider fields of characteristic 0 and 
characteristic p separately. Since we encounter fields of characteristic 0 most often, 
we will begin with them. 


A. Fields of Characteristic 0. Here is an application of Lemmas 5.3.4 and 5.3.5.


Proposition 5.3.7 If $F$ has characteristic 0, then:

(a) Every irreducible polynomial in $F[x]$ is separable.

(b) Every algebraic extension of $F$ is separable.

(c) A nonconstant polynomial $f \in F[x]$ is separable if and only if $f$ is a product of
irreducible polynomials, no two of which are multiples of each other.


Proof: Part (a) follows immediately from Lemma 5.3.5, and this implies part (b)
by Definition 5.3.3. Finally, part (c) follows from part (a) and Lemma 5.3.4. ∎


In characteristic 0, we can get rid of multiple roots as follows. 


Proposition 5.3.8 Let $F$ have characteristic 0, and suppose that $f \in F[x]$ has the
factorization $f = c g_1^{m_1} \cdots g_l^{m_l}$, where $c \in F$, $g_i \in F[x]$ is monic and irreducible for
$1 \le i \le l$, and $g_1, \dots, g_l$ are distinct. Then

$$\frac{f}{\gcd(f, f')} = c g_1 \cdots g_l. \tag{5.8}$$

Furthermore, $g_1 \cdots g_l$ is separable and has the same roots as $f$ in a splitting field.


Proof: Proposition 5.3.7 implies that $g_1 \cdots g_l$ is separable, and this polynomial and
$f$ clearly have the same roots in a splitting field. Hence it suffices to prove (5.8).

The factorization $f = c g_1^{m_1} \cdots g_l^{m_l}$ implies that we can compute $\gcd(f, f')$ by
finding the highest power of $g_i$ that divides $f'$ (do you see why?). If we write

$$f = g_i^{m_i} h_i, \quad h_i = c \prod_{j \ne i} g_j^{m_j},$$

then differentiating gives

$$f' = m_i g_i^{m_i - 1} g_i' h_i + g_i^{m_i} h_i' = g_i^{m_i - 1} (m_i g_i' h_i + g_i h_i').$$

This shows that $g_i^{m_i - 1} \mid f'$. If we had $g_i^{m_i} \mid f'$, then $g_i \mid (m_i g_i' h_i + g_i h_i')$, and thus $g_i \mid m_i g_i' h_i$.
Since $g_i$ is irreducible, this would force $g_i \mid m_i g_i'$ or $g_i \mid h_i$. The latter is impossible by
the definition of $h_i$, and the former is impossible because $m_i g_i'$ is nonzero of degree
$\deg(g_i) - 1$ (this is where we use characteristic 0). Hence $g_i^{m_i - 1}$ is the highest power
of $g_i$ dividing $f'$, which implies that

$$\gcd(f, f') = g_1^{m_1 - 1} \cdots g_l^{m_l - 1}.$$

The desired formula (5.8) follows immediately. ∎


This proposition is more powerful than it seems. For example, suppose that we
have a polynomial $f \in F[x]$ that has multiple roots in a splitting field, say

$$f = a_0 (x - \beta_1)^{m_1} \cdots (x - \beta_r)^{m_r}, \quad a_0 \in F, \ \beta_1, \dots, \beta_r \text{ distinct}, \ m_i \ge 1. \tag{5.9}$$

If we ignore the multiplicities, then we get the separable polynomial

$$g = a_0 (x - \beta_1) \cdots (x - \beta_r),$$

which has the same roots as $f$. There are three methods to find $g$:

• If we know the roots of $f$, then we get $g$ from the factorization (5.9). This requires
  knowing the roots, which rarely happens.

• If we know the irreducible factorization $f = c g_1^{m_1} \cdots g_l^{m_l}$ over $F$, then we get
  $g = c g_1 \cdots g_l$ by Proposition 5.3.8. This requires knowing the factorization, which
  can be time-consuming to compute.

• We get $g$ from the gcd computation given in (5.8) of Proposition 5.3.8.

In practice, the third method is the most efficient. Here is an example.


Example 5.3.9 Let $f = x^{11} - x^{10} + 2x^8 - 4x^7 + 3x^5 - 3x^4 + x^3 + 3x^2 - x - 1 \in \mathbb{Q}[x]$.
Using the gcd command in Maple or the PolynomialGCD command in Mathematica,
one finds that

$$\gcd(f, f') = x^6 - x^5 + x^3 - 2x^2 + 1.$$

It follows that

$$\frac{f}{\gcd(f, f')} = \frac{x^{11} - x^{10} + 2x^8 - 4x^7 + 3x^5 - 3x^4 + x^3 + 3x^2 - x - 1}{x^6 - x^5 + x^3 - 2x^2 + 1} = x^5 + x^2 - x - 1 \tag{5.10}$$

is a separable polynomial with the same roots as $f$.
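
For readers without Maple or Mathematica, the same computation can be reproduced with exact rational arithmetic. The following is a minimal sketch using only Python's standard `fractions` module; the helper names (`trim`, `derivative`, `divmod_poly`, `gcd_poly`) are hypothetical, polynomials are coefficient lists with the lowest degree first, and the gcd is computed by the plain Euclidean algorithm, which is not necessarily what those systems do internally.

```python
from fractions import Fraction

def trim(p):
    """Drop zero coefficients of highest degree (lists are lowest degree first)."""
    p = list(p)
    while p and p[-1] == 0:
        p.pop()
    return p

def derivative(p):
    """Formal derivative of a coefficient list."""
    return trim([i * c for i, c in enumerate(p)][1:])

def divmod_poly(a, b):
    """Quotient and remainder of a divided by b over Q (b must be nonzero)."""
    a, b = trim(a), trim(b)
    q = [Fraction(0)] * max(len(a) - len(b) + 1, 0)
    r = [Fraction(c) for c in a]
    while len(r) >= len(b):
        shift = len(r) - len(b)
        coef = r[-1] / Fraction(b[-1])
        q[shift] = coef
        for i, c in enumerate(b):
            r[shift + i] -= coef * c     # cancels the leading term of r
        r = trim(r)
    return trim(q), r

def gcd_poly(a, b):
    """Monic gcd of a and b (not both zero) via the Euclidean algorithm."""
    a, b = trim(a), trim(b)
    while b:
        a, b = b, divmod_poly(a, b)[1]
    lead = Fraction(a[-1])
    return [Fraction(c) / lead for c in a]

# f from Example 5.3.9, constant term first.
f = [-1, -1, 3, 1, -3, 3, 0, -4, 2, 0, -1, 1]
g = gcd_poly(f, derivative(f))
squarefree, rem = divmod_poly(f, g)

print(g == [1, 0, -2, 1, 0, -1, 1])        # x^6 - x^5 + x^3 - 2x^2 + 1
print(squarefree == [-1, -1, 1, 0, 0, 1])  # x^5 + x^2 - x - 1
```

Both comparisons agree with the gcd and with (5.10); in fact one can check by hand that $f = (x-1)^3 (x+1)^2 (x^3 + x + 1)^2$, so the quotient is $(x-1)(x+1)(x^3 + x + 1)$.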


B. Fields of Characteristic p. We begin with an important property of such
fields.


Lemma 5.3.10 Let $F$ be a field of characteristic $p$, and assume that $\alpha, \beta \in F$. Then
$(\alpha + \beta)^p = \alpha^p + \beta^p$ and $(\alpha - \beta)^p = \alpha^p - \beta^p$.

Proof: The binomial theorem implies that

$$(\alpha + \beta)^p = \alpha^p + \binom{p}{1} \alpha^{p-1} \beta + \cdots + \binom{p}{r} \alpha^{p-r} \beta^r + \cdots + \binom{p}{p-1} \alpha \beta^{p-1} + \beta^p.$$

In the proof of Proposition 4.2.5, we showed that $p \mid \binom{p}{r}$ for $1 \le r \le p - 1$. Since $F$
has characteristic $p$, the above identity reduces to $(\alpha + \beta)^p = \alpha^p + \beta^p$. In Exercise 2
you will use this to prove that $(\alpha - \beta)^p = \alpha^p - \beta^p$. ∎
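
The divisibility fact used in this proof is easy to test numerically. The sketch below is illustrative only (the function names are hypothetical and it relies on `math.comb` from the standard library): it checks that $p$ divides every interior binomial coefficient $\binom{p}{r}$ for a few primes, and confirms the resulting identity by brute force in $\mathbb{Z}/p\mathbb{Z}$.

```python
from math import comb  # Python 3.8+

def interior_binomials_vanish(p):
    """True if p divides C(p, r) for every 1 <= r <= p - 1."""
    return all(comb(p, r) % p == 0 for r in range(1, p))

def freshman_dream_holds(p):
    """Check (a + b)^p = a^p + b^p for all a, b in Z/pZ by brute force."""
    return all(
        pow(a + b, p, p) == (pow(a, p, p) + pow(b, p, p)) % p
        for a in range(p)
        for b in range(p)
    )

for p in (2, 3, 5, 7, 11, 13):
    print(p, interior_binomials_vanish(p), freshman_dream_holds(p))

# The primality of p matters: C(4, 2) = 6 is not divisible by 4.
print(4, interior_binomials_vanish(4), freshman_dream_holds(4))
```

Over $\mathbb{Z}/p\mathbb{Z}$ itself the identity also follows from Fermat's little theorem, so the brute-force check is only a sanity test; the lemma is more interesting over larger fields of characteristic $p$ such as $k(t)$.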


In Exercise 3 you will use Lemma 5.3.10 to show that 1 is the only $p$th root of
unity in a field of characteristic $p$.

Since $(\alpha\beta)^p = \alpha^p \beta^p$, Lemma 5.3.10 implies that the map $\alpha \mapsto \alpha^p$ is a ring homomorphism
over any field $F$ of characteristic $p$. This is the Frobenius homomorphism
of $F$. We will use Frobenius when we discuss finite fields in Chapter 11.

Here is our first example of a nonseparable irreducible polynomial. 


Example 5.3.11 Let $F = k(t)$, where $k$ has characteristic $p$ and $t$ is a variable. We
claim that $f = x^p - t \in F[x]$ is nonseparable and irreducible over $F$.

To prove this, note that $f$ has no roots in $F$, by Exercise 9 in Section 4.2. Since $p$
is prime, Proposition 4.2.6 implies that $f$ is irreducible over $F$. Furthermore, if $\alpha \in L$
is a root of $f$ in its splitting field $L$, then $\alpha^p = t$. Using Lemma 5.3.10, it follows that

$$(x - \alpha)^p = x^p - \alpha^p = x^p - t. \tag{5.11}$$

Thus $f$ does not have distinct roots in its splitting field $L$ and hence is not separable.

The polynomial $f$ also gives an example of a nonseparable finite extension.
Namely, $\alpha \in L$ is a root of $f$, so that $f$ is the minimal polynomial of $\alpha$ over $F$,
since $f$ is irreducible and monic. It follows that $F \subset L$ is not separable.

Note also that by (5.11), $\alpha$ is the only root of $f$. Hence the splitting field is
$L = F(\alpha)$. This implies $[L : F] = p$, since $f$ is the minimal polynomial of $\alpha$.


One caution is that over a field of characteristic $p$, not all irreducible polynomials
of degree $p$ fail to be separable. Here is a simple example.


Example 5.3.12 For the field $\mathbb{F}_2$ of two elements, $f = x^2 + x + 1 \in \mathbb{F}_2[x]$ is irreducible,
since it has no roots in $\mathbb{F}_2$. It is also separable, since $f' = 2x + 1 = 1$ is relatively
prime to $f$.


We will say more about characteristic p in the Mathematical Notes. 


C. Computations. To determine whether a monic polynomial $f \in F[x]$ is separable,
one can use either $\Delta(f)$ or $\gcd(f, f')$ by Proposition 5.3.2. We will briefly discuss
how to compute both using Maple and Mathematica.

We begin with a gcd computation. 


Example 5.3.13 Example 5.3.9 explained how Maple and Mathematica do this over
$\mathbb{Q}$. For example, if

$$f = x^6 + 10x^3 + 3x^2 + 1 \in \mathbb{Q}[x], \tag{5.12}$$

one computes $\gcd(f, f') = 1$, so that $f$ is separable. However, since $f$ has integer
coefficients, we can reduce modulo $p$ and obtain a polynomial $f_p \in \mathbb{F}_p[x]$. Then we
can ask whether $f_p$ is separable over $\mathbb{F}_p$.

For $p = 2$ or 3, we have $f_p' = 0$, since $f' = 6x^5 + 30x^2 + 6x$. Thus $\gcd(f_p, f_p') =
\gcd(f_p, 0) = f_p \ne 1$, so that $f_p$ is not separable for these primes. For a larger prime
such as $p = 557$ (the reason for this choice will soon become clear), we compute the
gcd over $\mathbb{F}_{557}$ using the Maple command

    Gcd(x^6 + 10*x^3 + 3*x^2 + 1, 6*x^5 + 30*x^2 + 6*x) mod 557;

which gives the result $x + 257$. Thus $f_{557}$ is not separable. In Mathematica, this
computation is done using the command

    PolynomialGCD[x^6 + 10x^3 + 3x^2 + 1, 6x^5 + 30x^2 + 6x,
        Modulus -> 557]

which gives the same answer $x + 257$.
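
The gcd over $\mathbb{F}_p$ can also be computed directly with the Euclidean algorithm. The sketch below is one way to do this in plain Python; `trim` and `gcd_mod_p` are hypothetical helper names, coefficient lists are lowest degree first, and `pow(x, -1, p)` (modular inverse) requires Python 3.8 or later. It is not a description of how Maple or Mathematica work internally.

```python
def trim(p):
    """Drop zero coefficients of highest degree (lists are lowest degree first)."""
    p = list(p)
    while p and p[-1] == 0:
        p.pop()
    return p

def gcd_mod_p(a, b, p):
    """Monic gcd of a and b in F_p[x], for a prime p and a, b not both zero mod p."""
    a = trim([c % p for c in a])
    b = trim([c % p for c in b])
    while b:
        inv_lead = pow(b[-1], -1, p)       # inverse of the leading coefficient of b
        r = a
        while len(r) >= len(b):            # long division: reduce r modulo b
            shift = len(r) - len(b)
            coef = r[-1] * inv_lead % p
            r = trim([(c - coef * b[i - shift]) % p if i >= shift else c
                      for i, c in enumerate(r)])
        a, b = b, r
    inv_lead = pow(a[-1], -1, p)
    return [c * inv_lead % p for c in a]

# f = x^6 + 10x^3 + 3x^2 + 1 and f' = 6x^5 + 30x^2 + 6x, constant term first.
f = [1, 0, 3, 10, 0, 0, 1]
df = [0, 6, 30, 0, 0, 6]

print(gcd_mod_p(f, df, 557))   # coefficients of x + 257
print(gcd_mod_p(f, df, 7))     # [1]: f_7 is separable
print(gcd_mod_p(f, df, 2))     # f' vanishes mod 2, so the gcd is f_2 itself
```

The first result can be confirmed by hand: $x + 257$ has the root $300$ in $\mathbb{F}_{557}$, and both $f(300)$ and $f'(300)$ are divisible by 557.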


The second approach to studying whether $f$ is separable would be to compute the
discriminant $\Delta(f)$. Chapter 2 gave a cumbersome method for computing $\Delta(f)$ that
expresses $\Delta = \prod_{i<j} (x_i - x_j)^2$ in terms of the elementary symmetric polynomials $\sigma_i$ and
then evaluates the $\sigma_i$ at the coefficients of $f$ (up to the usual sign). A more efficient
approach uses the resultant.

We will not discuss resultants in detail, for this would take us too far afield. The
idea is that for $f, g \in F[x]$, their resultant

$$\mathrm{Res}(f, g, x) \in F$$

is a polynomial in the coefficients of $f$ and $g$ with the property that

$$\mathrm{Res}(f, g, x) = 0 \iff f \text{ and } g \text{ have a common root in an extension of } F.$$

An introduction to resultants can be found in [1, Ch. 3, §5] and [3, pp. 97-104].
For us, the most important property of resultants is that if $f \in F[x]$ is monic of
degree $n > 1$, then

$$\Delta(f) = (-1)^{n(n-1)/2}\, \mathrm{Res}(f, f', x) \tag{5.13}$$

(see [3, pp. 103-104]). In Maple and Mathematica, the resultant of $f, g$ is computed
using the commands resultant(f,g,x) and Resultant[f,g,x].


Example 5.3.14 As an example, consider the polynomial $f = x^6 + 10x^3 + 3x^2 + 1$
given by (5.12). This leads to

$$\Delta(f) = (-1)^{6 \cdot 5/2}\, \mathrm{Res}(f, f', x) = -(-649684800) = 2^6 \cdot 3^6 \cdot 5^2 \cdot 557.$$

As before, reducing $f$ modulo $p$ gives $f_p \in \mathbb{F}_p[x]$. In Exercise 4 you will show that
$\Delta(f_p) \in \mathbb{F}_p$ is the congruence class of $\Delta(f)$ modulo $p$. Thus

$$f_p \text{ is separable over } \mathbb{F}_p \iff \Delta(f) \not\equiv 0 \bmod p.$$

It follows that $f_p$ is separable over $\mathbb{F}_p$ if and only if $p \notin \{2, 3, 5, 557\}$. From here,
one easily finds that

$$\gcd(f_p, f_p') = \begin{cases} f_p, & p = 2, 3, \\ x^2 + 3, & p = 5, \\ x + 257, & p = 557, \\ 1, & \text{otherwise}. \end{cases} \tag{5.14}$$

You will compute a similar example in Exercise 5.
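
The discriminant in this example can be reproduced from (5.13) without a computer algebra system. The sketch below assumes the standard description of the resultant as the determinant of the Sylvester matrix (the text does not give this construction, so treat it as an outside assumption); the helper names are hypothetical, and coefficient lists here are highest degree first.

```python
from fractions import Fraction

def sylvester(f, g):
    """Sylvester matrix of f and g (coefficient lists, highest degree first)."""
    m, n = len(f) - 1, len(g) - 1
    size = m + n
    rows = [[0] * i + list(f) + [0] * (size - m - 1 - i) for i in range(n)]
    rows += [[0] * i + list(g) + [0] * (size - n - 1 - i) for i in range(m)]
    return rows

def det(matrix):
    """Determinant by Gaussian elimination with exact rational arithmetic."""
    a = [[Fraction(x) for x in row] for row in matrix]
    n = len(a)
    result = Fraction(1)
    for col in range(n):
        pivot = next((r for r in range(col, n) if a[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            result = -result               # a row swap flips the sign
        result *= a[col][col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n):
                a[r][c] -= factor * a[col][c]
    return result

def discriminant(f):
    """Discriminant of a monic f of degree n > 1, using formula (5.13)."""
    n = len(f) - 1
    df = [(n - i) * c for i, c in enumerate(f[:-1])]   # formal derivative
    return (-1) ** (n * (n - 1) // 2) * det(sylvester(f, df))

f = [1, 0, 0, 10, 3, 0, 1]           # x^6 + 10x^3 + 3x^2 + 1
disc = int(discriminant(f))
print(disc == 2**6 * 3**6 * 5**2 * 557)
print([p for p in (2, 3, 5, 7, 11, 557) if disc % p == 0])
```

The second line lists which of the sample primes divide $\Delta(f)$, matching the set of primes for which $f_p$ fails to be separable.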


Mathematical Notes

Our treatment omits many interesting results about separability.

■ Separable Extensions. Here are some conditions that imply separability.


Theorem 5.3.15

(a) If $L = F(\alpha_1, \dots, \alpha_n)$, where each $\alpha_i$ is separable over $F$, then $F \subset L$ is separable.
(b) If $F \subset L$ is the splitting field of a separable polynomial, then $F \subset L$ is separable.
(c) If $F \subset K$ and $K \subset L$ are separable extensions, then $F \subset L$ is separable. ∎


We will defer our proof of part (a) until Chapter 7, since it uses some ideas from
Galois theory. (For a proof that doesn’t use Galois theory, see Corollaries 1 and 3 of
[Garling, Sec. 10.2].) In Exercise 6 you will show that part (b) follows from part (a).
The proof of part (c) requires the concept of separable degree, which is discussed in
[Grillet, Sec. 7.2].


■ The Structure of Irreducible Polynomials. Although irreducible polynomials are
separable in characteristic 0, things are more complicated in characteristic $p$. In this
case, irreducible polynomials are built from separable ones as follows.


Proposition 5.3.16 Let $F$ have characteristic $p$, and let $f \in F[x]$ be irreducible.
Then there is an integer $e \ge 0$ and a separable, irreducible polynomial $g \in F[x]$ such
that $f(x) = g(x^{p^e})$. ∎


You will prove this in Exercise 7.


■ Purely Inseparable Extensions. If an algebraic extension $F \subset L$ is not separable,
then some (but not necessarily all) elements of $L$ have nonseparable minimal
polynomials. Here is a simple example.


Example 5.3.17 Suppose that $k$ has characteristic 3, and let $t, u$ be variables. Consider
$F = k(t, u)$, and let $F \subset L$ be the splitting field of $f = (x^2 - t)(x^3 - u)$. Thus $L$
contains elements $\alpha, \beta$ such that $\alpha^2 = t$ and $\beta^3 = u$. In Exercise 8 you will prove the
following:

• The minimal polynomial of $\alpha$ over $F$ is $x^2 - t$, which is separable. Thus $\alpha$ is
  separable over $F$.

• The minimal polynomial of $\beta$ over $F$ is $x^3 - u$, which is not separable (remember
  that $F$ has characteristic 3). Hence $\beta$ is not separable over $F$.

Thus some elements of $L$ are separable over $F$ while others are not.


However, some extensions have very few separable elements. Given $F \subset L$,
it is clear that every $\alpha \in F$ is separable over $F$. Thus we say that an algebraic
extension $F \subset L$ is purely inseparable if no element of $L \setminus F$ is separable over $F$. For
example, you will prove in Exercise 9 that the extension of Example 5.3.11 is purely
inseparable.

In general, if $F \subset L$ is purely inseparable, then the minimal polynomial of $\alpha \in L$
is of the form $x^{p^e} - a$ for some $e \ge 0$ and $a \in F$. This implies that the degree of a
finite purely inseparable extension is a power of $p$ (see Exercise 10).

Returning to the case of an arbitrary algebraic extension $F \subset L$ in characteristic $p$,
one can “separate” the separable elements from the inseparable ones. More precisely,
one can prove the existence of a unique intermediate field $F \subset K \subset L$ such that $K$
is separable over $F$ and $L$ is purely inseparable over $K$. A proof can be found in
Section 8.7 of [Jacobson, Vol. II].


■ The Squarefree Decomposition of a Polynomial. Proposition 5.3.8 shows that
if $F$ has characteristic 0 and $f \in F[x]$, then $g = f/\gcd(f, f') \in F[x]$ is separable and
has the same roots as $f$. This means that

$$f = gh, \tag{5.15}$$

where $g$ is separable and every root of $h$ has multiplicity at least 2 as a root of $f$. In this
situation, we call $g$ the squarefree part of $f$, and (5.15) its squarefree decomposition.
Squarefree decompositions also exist when $F$ has characteristic $p$ (this is proved in
[2, Tutorial 5, pp. 37-38]). The difference is that in characteristic $p$, the squarefree
part $g$ need not have the same roots as $f$ (can you give an example?). See also
Exercise 11.


Exercises for Section 5.3 


Exercise 1. Prove (5.6). 


Exercise 2. Let $F$ have characteristic $p$, and suppose that $\alpha, \beta \in F$. Lemma 5.3.10 shows that
$(\alpha + \beta)^p = \alpha^p + \beta^p$.

(a) Prove that $(\alpha - \beta)^p = \alpha^p - \beta^p$ if $\alpha, \beta \in F$.

(b) Prove that $(\alpha + \beta)^{p^e} = \alpha^{p^e} + \beta^{p^e}$ for all $e \ge 0$.


Exercise 3. Let $F$ be a field of characteristic $p$. The $n$th roots of unity are defined to be the
roots of $x^n - 1$ in the splitting field $F \subset L$ of $x^n - 1$.

(a) If $p \nmid n$, show that there are $n$ distinct $n$th roots of unity in $L$.

(b) Show that there is only one $p$th root of unity, namely $1 \in F$.


Exercise 4. Let $f \in \mathbb{Z}[x]$ be monic and nonconstant and have discriminant $\Delta(f)$. Then let
$f_p \in \mathbb{F}_p[x]$ be obtained from $f$ by reducing modulo $p$. Prove that $\Delta(f_p) \in \mathbb{F}_p$ is the congruence
class of $\Delta(f)$.


Exercise 5. For $f = x^7 + x + 1$, find all primes for which $f_p$ is not separable, and compute
$\gcd(f_p, f_p')$ as in (5.14).


Exercise 6. Use part (a) of Theorem 5.3.15 to show that the splitting field of a separable 
polynomial gives a separable extension. 


Exercise 7. Suppose that $F$ is a field of characteristic $p$. The goal of this exercise is to prove
Proposition 5.3.16. To begin the proof, let $f \in F[x]$ be irreducible.

(a) Assume that $f'$ is not identically zero. Then use the argument of Lemma 5.3.5 to show
that $f$ is separable.

(b) Now assume that $f'$ is identically zero. Show that there is a polynomial $g_1 \in F[x]$ such
that $f(x) = g_1(x^p)$.

(c) Show that the polynomial $g_1$ of part (b) is irreducible.

(d) Now apply parts (a)-(c) to $g_1$ repeatedly until you get a separable polynomial $g$, and
conclude that $f(x) = g(x^{p^e})$ where $e \ge 0$ and $g \in F[x]$ is irreducible and separable.


Exercise 8. Let $F = k(t, u)$ and $f = (x^2 - t)(x^3 - u)$ be as in Example 5.3.17. Then the
splitting field of $f$ contains elements $\alpha, \beta$ such that $\alpha^2 = t$ and $\beta^3 = u$.
(a) Prove that $x^2 - t$ is the minimal polynomial of $\alpha$ over $F$. Also show that $x^2 - t$ is separable.
(b) Similarly, prove that $x^3 - u$ is the minimal polynomial of $\beta$ over $F$, and show that $x^3 - u$
is not separable.


Exercise 9. Let $F$ be a field of characteristic $p$, and consider $f = x^p - a \in F[x]$. We will
assume that $f$ has no roots in $F$, so that $f$ is irreducible by Proposition 4.2.6. Let $\alpha$ be a root
of $f$ in some extension of $F$.

(a) Argue as in Example 5.3.11 that $F(\alpha)$ is the splitting field of $f$ and that $[F(\alpha) : F] = p$.

(b) Let $\beta \in F(\alpha) \setminus F$. Use Lemma 5.3.10 to show that $\beta^p \in F$.

(c) Use parts (a) and (b) to show that the minimal polynomial of $\beta$ over $F$ is $x^p - \beta^p$.

(d) Conclude that $F \subset F(\alpha)$ is purely inseparable.


Exercise 10. Suppose that $F$ has characteristic $p$ and $F \subset L$ is a finite extension.

(a) Use Proposition 5.3.16 to prove that $F \subset L$ is purely inseparable if and only if the minimal
polynomial of every $\alpha \in L$ is of the form $x^{p^e} - a$ for some $e \ge 0$ and $a \in F$.

(b) Now suppose that $F \subset L$ is purely inseparable. Prove that $[L : F]$ is a power of $p$.


Exercise 11. Let $f \in F[x]$ be nonconstant. We say that $f$ is squarefree if $f$ is not divisible by
the square of a nonconstant polynomial in $F[x]$.
(a) Prove that $f$ is squarefree if and only if $f$ is a product of irreducible polynomials, no two
of which are multiples of each other.
(b) Assume that $F$ has characteristic 0. Prove that $f$ is separable if and only if $f$ is squarefree.


Exercise 12. Prove that $f \in F[x]$ is separable if and only if $f$ is nonconstant and $f$ and $f'$ have
no common roots in any extension of $F$.


Exercise 13. Let $F$ have characteristic $p$, and let $F \subset L$ be a finite extension with $p \nmid [L : F]$.
Prove that $F \subset L$ is separable.


Exercise 14. Let $F \subset K \subset L$ be field extensions, and assume that $L$ is separable over $F$. Prove
that $F \subset K$ and $K \subset L$ are separable extensions. Note that this is the converse of part (c) of
Theorem 5.3.15.


Exercise 15. Let f be the polynomial considered in Example 5.3.9. Use Maple or Mathematica 
to factor f and to verify that the product of the distinct irreducible factors of f is the polynomial 
given in (5.10). 

Exercise 16. Let $F$ have characteristic $p$ and consider $f = x^p - x + a \in F[x]$.

(a) Show that $f$ is separable.

(b) Let $\alpha$ be a root of $f$ in some extension of $F$. Show that $\alpha + 1$ is also a root.

(c) Use part (b) to show that $f$ splits completely over $F(\alpha)$.

(d) Use part (a) of Theorem 5.3.15 to show that $F \subset F(\alpha)$ is separable and normal.

In Exercise 5 of Section 6.2 you will show that if $f$ is irreducible over $F$, then
$\mathrm{Gal}(F(\alpha)/F) \simeq \mathbb{Z}/p\mathbb{Z}$, generated by the automorphism sending $\alpha$ to $\alpha + 1$. This is related to a theorem of
Artin and Schreier, which states that in characteristic $p$, every separable, normal extension of
degree $p$ is the splitting field of an irreducible polynomial of the form $x^p - x + a$.


Exercise 17. Let $\beta$ be a root of a polynomial $f$.

(a) Assume that $f(x) = (x - \beta)^m h(x)$ for some polynomial $h(x)$, and let $f^{(m)}$ denote the $m$th
derivative of $f$. Prove that $f^{(m)}(\beta) = m!\, h(\beta)$.

(b) Assume that we are in characteristic 0. Prove that $\beta$ has multiplicity $m$ as a root of $f$ if
and only if $f(\beta) = f'(\beta) = \cdots = f^{(m-1)}(\beta) = 0$ and $f^{(m)}(\beta) \ne 0$.

(c) Assume that we are in characteristic $p$. How big does $p$ need to be relative to $m$ in order
for the equivalence of part (b) to be still valid?


5.4 THEOREM OF THE PRIMITIVE ELEMENT 


Of the extension fields $F \subset L$ studied so far, the nicest case is when $L = F(\alpha)$ for
some $\alpha \in L$. When this happens, we say that $\alpha$ is a primitive element of $F \subset L$.
In this section, we will show that many but not all finite extensions have primitive
elements.

Here is the Theorem of the Primitive Element.


Theorem 5.4.1 Let $F \subset L = F(\alpha_1, \dots, \alpha_n)$ be a finite extension, where each $\alpha_i$ is
separable over $F$. Then there is $\alpha \in L$ separable over $F$ such that $L = F(\alpha)$.
Furthermore, if $F$ is infinite, then $\alpha$ can be chosen to be of the form

$$\alpha = t_1 \alpha_1 + \cdots + t_n \alpha_n,$$

where $t_1, \dots, t_n \in F$.


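Proof: First assume that $F$ is infinite and that $L = F(\alpha_1, \dots, \alpha_n)$, where each $\alpha_i$ is
separable over $F$. We will use induction on $n$ to show that there are $t_1, \dots, t_n \in F$
such that $L = F(t_1 \alpha_1 + \cdots + t_n \alpha_n)$ and $t_1 \alpha_1 + \cdots + t_n \alpha_n$ is separable over $F$.

We begin with the case $n = 2$. Given $L = F(\beta, \gamma)$, let $f, g \in F[x]$ be the minimal
polynomials of $\beta, \gamma$, respectively, and set $\ell = \deg(f)$, $m = \deg(g)$. In a splitting field
of $fg$, the separability of $\beta, \gamma$ implies that

    $f$ has distinct roots $\beta = \beta_1, \beta_2, \dots, \beta_\ell$,
    $g$ has distinct roots $\gamma = \gamma_1, \gamma_2, \dots, \gamma_m$.

Since $F$ is infinite, we can find $\lambda \in F$ such that

$$\lambda \ne \frac{\beta_i - \beta_r}{\gamma_s - \gamma_j} \quad \text{for } 1 \le r, i \le \ell, \ 1 \le s, j \le m, \ s \ne j.$$

This easily implies that

$$\beta_r + \lambda \gamma_s \ne \beta_i + \lambda \gamma_j \quad \text{for } (r, s) \ne (i, j). \tag{5.16}$$

In particular, since $\beta = \beta_1$ and $\gamma = \gamma_1$, we have

$$\beta + \lambda \gamma \ne \beta_i + \lambda \gamma_j \quad \text{for } 1 \le i \le \ell, \ 2 \le j \le m. \tag{5.17}$$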


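We first prove that $F(\beta + \lambda\gamma) = F(\beta, \gamma)$. Since $F(\beta + \lambda\gamma) \subset F(\beta, \gamma)$ is obvious,
it suffices to show that $\beta, \gamma \in F(\beta + \lambda\gamma)$. We begin with $\gamma$. Observe that

• $g(x)$ vanishes at $\gamma$ and lies in $F[x] \subset F(\beta + \lambda\gamma)[x]$;
• $f(\beta + \lambda\gamma - \lambda x)$ vanishes at $\gamma$ (check this!) and also lies in $F(\beta + \lambda\gamma)[x]$.

Our strategy will be to study the greatest common divisor of the polynomials $g(x)$
and $f(\beta + \lambda\gamma - \lambda x)$. We first note that if the gcd were 1, then

$$A(x) g(x) + B(x) f(\beta + \lambda\gamma - \lambda x) = 1$$

for some $A, B \in F(\beta + \lambda\gamma)[x]$. By the above bullets, evaluating this at $x = \gamma$ would
give $0 = 1$. Hence

$$h(x) = \gcd\bigl(g(x), f(\beta + \lambda\gamma - \lambda x)\bigr) \in F(\beta + \lambda\gamma)[x]$$

has degree at least 1. If the degree were $> 1$, then $h(x) \mid g(x)$ implies that for some
$2 \le j \le m$, $\gamma_j$ would be a root of $h(x)$ (do you see how this uses the separability of
$g$?). But since $h(x) \mid f(\beta + \lambda\gamma - \lambda x)$, $\gamma_j$ must also be a root of $f(\beta + \lambda\gamma - \lambda x)$, i.e.,
$f(\beta + \lambda\gamma - \lambda\gamma_j) = 0$. Since the roots of $f$ are $\beta = \beta_1, \dots, \beta_\ell$, this implies

$$\beta + \lambda\gamma - \lambda\gamma_j = \beta_i \quad \text{for some } 1 \le i \le \ell,$$

which contradicts (5.17). Hence $h$ has degree 1, and then $h = x - \gamma$ follows, since
$\gamma$ is a root. But we also know $h \in F(\beta + \lambda\gamma)[x]$, so that $\gamma \in F(\beta + \lambda\gamma)$. Then
$\beta = (\beta + \lambda\gamma) - \lambda \cdot \gamma \in F(\beta + \lambda\gamma)$ follows immediately, since $\lambda \in F$. This completes
the proof that $F(\beta, \gamma) = F(\beta + \lambda\gamma)$.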

Next let $p \in F[x]$ be the minimal polynomial of $\beta + \lambda\gamma$ over $F$. We need to show
that $p$ is separable. For this purpose, consider

(5.18) $s(x) = \prod_{j=1}^{m} f(x - \lambda\gamma_j)$.

Note that $\beta + \lambda\gamma$ is a root of $s$, since $\beta = \beta_1$ and $\gamma = \gamma_1$. Furthermore, since $f \in F[x]$, $\lambda \in F$,
and $\gamma_1,\dots,\gamma_m$ are the roots of $g \in F[x]$, one can easily show that $s \in F[x]$ using the
techniques used in the proofs of Theorem 3.2.4 and Proposition 5.2.1. We leave the
details as Exercise 1. It follows that $p$ divides $s$ in $F[x]$. However, we also have
$f = (x - \beta_1)\cdots(x - \beta_\ell)$, which when combined with (5.18) gives the formula

(5.19) $s(x) = \prod_{i=1}^{\ell}\prod_{j=1}^{m} \bigl(x - (\beta_i + \lambda\gamma_j)\bigr)$.

Then (5.16) implies that $s$ has distinct roots. Hence $p$ is also separable (it divides $s$),
which proves that $\beta + \lambda\gamma$ is separable over $F$. Letting $t_1 = 1$ and $t_2 = \lambda$, we see that
the theorem is true for $n = 2$.


Now suppose that $n > 2$ and that $L = F(\alpha_1,\dots,\alpha_n)$, where each $\alpha_i$ is separable
over $F$. By our inductive hypothesis, we can find $t_1,\dots,t_{n-1} \in F$ such that
$F(\alpha_1,\dots,\alpha_{n-1}) = F(\alpha_0)$, where $\alpha_0 = t_1\alpha_1 + \cdots + t_{n-1}\alpha_{n-1}$ is separable over $F$.
Then

$$L = F(\alpha_1,\dots,\alpha_n) = F(\alpha_1,\dots,\alpha_{n-1})(\alpha_n) = F(\alpha_0)(\alpha_n) = F(\alpha_0,\alpha_n).$$

By the proof for $n = 2$, we have $F(\alpha_0,\alpha_n) = F(\alpha_0 + \lambda\alpha_n)$ for some $\lambda \in F$, where
$\alpha_0 + \lambda\alpha_n$ is separable over $F$. If we set $t_n = \lambda$, then $\alpha_0 + \lambda\alpha_n = t_1\alpha_1 + \cdots + t_n\alpha_n$ is
the desired separable primitive element. This completes the proof when $F$ is infinite.

The proof of the theorem is very different in the case when $F$ is a finite field. We
will give the argument in Exercise 2. ∎
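
The choice of $\lambda$ in the case $n = 2$ only has to avoid finitely many ratios, so it can be found by a simple search. The following is a minimal sketch, not part of the proof: it assumes the roots are given as floating-point approximations, searches over the integers $0, 1, 2, \dots$ (which lie in $F$ when $F$ has characteristic 0), and uses a tolerance `tol` to compare numbers. The function names `excluded_values` and `choose_lambda` are our own.

```python
import math
from itertools import count

def excluded_values(betas, gammas):
    """All ratios (beta_i - beta_r) / (gamma_s - gamma_j) with s != j."""
    return [(bi - br) / (gs - gj)
            for br in betas for bi in betas
            for s, gs in enumerate(gammas)
            for j, gj in enumerate(gammas) if s != j]

def choose_lambda(betas, gammas, tol=1e-9):
    """Smallest nonnegative integer that avoids every excluded ratio (up to tol)."""
    bad = excluded_values(betas, gammas)
    for lam in count(0):
        if all(abs(lam - v) > tol for v in bad):
            return lam

r2, r3 = math.sqrt(2), math.sqrt(3)
lam = choose_lambda([r2, -r2], [r3, -r3])   # roots of x^2 - 2 and x^2 - 3
# The four sums beta_i + lam * gamma_j that appear in (5.16).
sums = [b + lam * g for b in (r2, -r2) for g in (r3, -r3)]
```

For the roots of $x^2 - 2$ and $x^2 - 3$ the excluded ratios are $0$ and $\pm\sqrt{2/3}$, so the search rejects $0$ and stops at the next integer; the sums $\beta_i + \lambda\gamma_j$ are then pairwise distinct, as (5.16) requires.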


Here are two situations when the hypotheses of Theorem 5.4.1 are satisfied. 


Corollary 5.4.2 Let $F \subset L$ be a finite extension.

(a) If $F \subset L$ is separable, then there is $\alpha \in L$ such that $L = F(\alpha)$.

(b) If $F$ has characteristic 0, then there is $\alpha \in L$ such that $L = F(\alpha)$. Furthermore,
if $L = F(\alpha_1,\dots,\alpha_n)$, then $\alpha$ can be chosen to be of the form

$$\alpha = t_1\alpha_1 + \cdots + t_n\alpha_n,$$

where $t_1,\dots,t_n \in F$.


Proof: We know that $L = F(\alpha_1,\dots,\alpha_n)$ since $F \subset L$ is finite. In part (a), each
$\alpha_i$ is separable since $F \subset L$ is separable, so we are done by Theorem 5.4.1. For
part (b), let $F$ have characteristic 0. Then $F$ is infinite and each $\alpha_i$ is separable by
Proposition 5.3.7. Again, we are done by Theorem 5.4.1. ∎


Every field of characteristic 0 contains a copy of $\mathbb{Z}$. In Exercise 3 you will use
this to show that in the equation $\alpha = t_1\alpha_1 + \cdots + t_n\alpha_n$ in part (b) of Corollary 5.4.2,
we can assume that $t_1,\dots,t_n \in \mathbb{Z}$. This observation is due to Galois.

In some simple cases, one can explicitly find primitive elements. 


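Example 5.4.3 Consider $\mathbb{Q} \subset \mathbb{Q}(\sqrt{2},\sqrt{3})$. In the notation of the proof of Theorem 5.4.1,
we have $\beta_1 = \sqrt{2}$, $\beta_2 = -\sqrt{2}$ for $f = x^2 - 2$, and $\gamma_1 = \sqrt{3}$, $\gamma_2 = -\sqrt{3}$
for $g = x^2 - 3$. Then any $\lambda \ne 0$ in $\mathbb{Q}$ satisfies (5.16). Thus $\sqrt{2} + \lambda\sqrt{3}$ is a primitive
element of $\mathbb{Q} \subset \mathbb{Q}(\sqrt{2},\sqrt{3})$ for all $\lambda \in \mathbb{Q} \setminus \{0\}$. ◁▷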
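
For $\lambda = 1$ the claim can be checked by exact arithmetic. The sketch below is an illustration under our own conventions, not the method of the text: an element $a + b\sqrt{2} + c\sqrt{3} + d\sqrt{6}$ is stored as the tuple `(a, b, c, d)` of rational coefficients, and the helper names `mul`, `add`, and `scale` are hypothetical. It expresses $\sqrt{2}$ and $\sqrt{3}$ as polynomials in $\gamma = \sqrt{2} + \sqrt{3}$ and evaluates $x^4 - 10x^2 + 1$ at $\gamma$.

```python
from fractions import Fraction

def mul(x, y):
    """Multiply two elements written in the basis 1, sqrt2, sqrt3, sqrt6."""
    a, b, c, d = x
    e, f, g, h = y
    return (
        a*e + 2*b*f + 3*c*g + 6*d*h,   # coefficient of 1
        a*f + b*e + 3*c*h + 3*d*g,     # coefficient of sqrt(2)
        a*g + c*e + 2*b*h + 2*d*f,     # coefficient of sqrt(3)
        a*h + d*e + b*g + c*f,         # coefficient of sqrt(6)
    )

def add(x, y):
    return tuple(p + q for p, q in zip(x, y))

def scale(r, x):
    return tuple(r * p for p in x)

ONE = (Fraction(1), Fraction(0), Fraction(0), Fraction(0))
gamma = (Fraction(0), Fraction(1), Fraction(1), Fraction(0))  # sqrt2 + sqrt3

g2 = mul(gamma, gamma)
g3 = mul(g2, gamma)
g4 = mul(g2, g2)

# sqrt2 = (gamma^3 - 9*gamma) / 2 and sqrt3 = gamma - sqrt2
sqrt2 = scale(Fraction(1, 2), add(g3, scale(-9, gamma)))
sqrt3 = add(gamma, scale(-1, sqrt2))
# value of x^4 - 10x^2 + 1 at gamma
quartic_at_gamma = add(add(g4, scale(-10, g2)), ONE)
```

Since both $\sqrt{2}$ and $\sqrt{3}$ come out as polynomials in $\gamma$ with rational coefficients, they lie in $\mathbb{Q}(\gamma)$, which is the content of the example for $\lambda = 1$.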


Not all finite extensions have primitive elements. By Corollary 5.4.2, such an 
extension cannot have characteristic 0. Here is an example in characteristic p. 


Example 5.4.4 Let $k$ be a field of characteristic $p$ and let $t, u$ be variables. Consider
the extension field

(5.20) $F = k(t,u) \subset L$,

where $L$ is the splitting field of $(x^p - t)(x^p - u) \in F[x]$. Thus there are $\alpha, \beta \in L$ with
$\alpha^p = t$ and $\beta^p = u$. By Exercise 4, we have $L = F(\alpha,\beta)$ and $[L:F] = p^2$.

Let us show that (5.20) has no primitive element. Given $\gamma \in L$, we can use
$L = F(\alpha,\beta) = F[\alpha,\beta]$ to write

$$\gamma = \sum_{i,j} a_{ij}\alpha^i\beta^j, \quad a_{ij} \in F,$$

where the sum is finite. Lemma 5.3.10 implies that

$$\gamma^p = \Bigl(\sum_{i,j} a_{ij}\alpha^i\beta^j\Bigr)^p = \sum_{i,j} a_{ij}^p\,\alpha^{ip}\beta^{jp},$$

and then $\alpha^p = t$ and $\beta^p = u$ give

$$\gamma^p = \sum_{i,j} a_{ij}^p\,t^i u^j \in F.$$

Hence $\gamma$ is a root of $x^p - \gamma^p \in F[x]$, so that $[F(\gamma):F] \le p$. Since $[L:F] = p^2$, we
have $L \ne F(\gamma)$ for all $\gamma \in L$. Thus $F \subset L$ has no primitive element. ◁▷
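
The step that pulls the $p$th power through the sum rests on the fact that the binomial coefficients $\binom{p}{k}$ with $0 < k < p$ vanish in characteristic $p$. Lemma 5.3.10 is not reproduced in this section, so the following is only a numerical sanity check of that underlying fact, with a function name of our own choosing.

```python
from math import comb

def middle_binomials_vanish(p):
    """Return True if p divides C(p, k) for every 0 < k < p."""
    return all(comb(p, k) % p == 0 for k in range(1, p))

primes_ok = all(middle_binomials_vanish(p) for p in (2, 3, 5, 7, 11, 13))
composite_ok = middle_binomials_vanish(4)   # C(4, 2) = 6 is not divisible by 4
```

The check succeeds for the primes listed and fails for the composite number 4, which is why the argument of the example is specific to prime characteristic.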


In Exercise 4 you will show that the extension (5.20) is purely inseparable.


Mathematical Notes


Theorem 5.4.1 leads to the following question about primitive elements.


• Existence of Primitive Elements. Corollary 5.4.2 tells us that all finite separable
extensions have primitive elements. But this is not the full story, since the extension
$F \subset L = F(\alpha)$ discussed in Example 5.3.11 is not separable but has a primitive
element. The following theorem of Steinitz characterizes all finite extensions that
have primitive elements.


Theorem 5.4.5 A finite extension $F \subset L$ has a primitive element if and only if there
are only finitely many intermediate fields $F \subset K \subset L$.


A proof can be found in Section 4.14 of [Jacobson, Vol. I]. As an example of this
result, consider the extension $F \subset L$ from (5.20). Since this has no primitive element,
there must be infinitely many fields $K$ such that $F \subset K \subset L$. You will construct an
infinite collection of such fields in Exercise 5.


Historical Notes 


We will see in Chapter 12 that Lagrange and Galois knew special cases of the 
Theorem of the Primitive Element. 
Exercises for Section 5.4


Exercise 1. Use the hints given in the text to prove that (5.18) has coefficients in F. 


Exercise 2. Let $F$ be a finite field, and let $F \subset L$ be a finite extension. We claim that there is
$\alpha \in L$ such that $L = F(\alpha)$ and $\alpha$ is separable over $F$.
(a) Show that $L$ is a finite field.
(b) The set $L^* = L \setminus \{0\}$ is a finite group under multiplication and hence is cyclic by
Proposition A.5.3. Let $\alpha \in L^*$ be a generator. Prove that $L = F(\alpha)$.
(c) Let $m = |L| - 1$. Show that $\alpha^i$ is a root of $x^m - 1 \in F[x]$ for all $0 \le i \le m-1$, and conclude
that
$$x^m - 1 = (x-1)(x-\alpha)(x-\alpha^2)\cdots(x-\alpha^{m-1}).$$
(d) Use part (c) to show that $\alpha$ is separable over $F$.


Exercise 3. In the equation $\alpha = t_1\alpha_1 + \cdots + t_n\alpha_n$ in part (b) of Corollary 5.4.2, show that we
can assume that $t_1,\dots,t_n \in \mathbb{Z}$.


Exercise 4. In the extension $F \subset L$ of Example 5.4.4, we have $F = k(t,u)$, where $k$ has
characteristic $p$ and $L$ is the splitting field of $(x^p - t)(x^p - u) \in F[x]$. We also have $\alpha, \beta \in L$
satisfying $\alpha^p = t$, $\beta^p = u$. Prove the following properties of $F \subset L$:

(a) $L = F(\alpha,\beta)$ and $[L:F] = p^2$.

(b) $[F(\gamma):F] = p$ for all $\gamma \in L \setminus F$.

(c) $F \subset L$ is purely inseparable.


Exercise 5. Let $F \subset L = F(\alpha,\beta)$ be as in Exercise 4, and consider the intermediate fields
$F \subset F(\alpha + \lambda\beta) \subset L$ as $\lambda$ varies over all elements of $F$. Suppose that $\lambda \ne \mu$ are two elements
of $F$ such that $F(\alpha + \lambda\beta) = F(\alpha + \mu\beta)$.

(a) Show that $\alpha, \beta \in F(\alpha + \lambda\beta)$.

(b) Conclude that $F(\alpha + \lambda\beta) = F(\alpha,\beta)$, and explain why this contradicts Example 5.4.4.
It follows that the fields $F(\alpha + \lambda\beta)$, $\lambda \in F$, are all distinct. Since $F$ is infinite, we see that
there are infinitely many fields between $F$ and $L$.


Exercise 6. Explain why the proof of Theorem 5.4.1 implies that $F(\beta + \lambda\gamma) = F(\beta,\gamma)$ when
$\gamma$ is separable over $F$, $\beta$ is algebraic over $F$, and $\lambda$ satisfies (5.17).


Exercise 7. Let $F \subset L = F(\alpha_1,\dots,\alpha_n)$ be a finite extension, and suppose that $\alpha_1,\dots,\alpha_{n-1}$
are separable over $F$. Prove that $F \subset L$ has a primitive element.


Exercise 8. Use Exercise 7 to find an explicit primitive element for F = k(t,u) C L, where k 
has characteristic 3 and L is the splitting field of (x? —t)(x? — u). Note that this extension is 
not separable, by Exercise 8 of Section 5.3. 


REFERENCES 


1. D. Cox, J. Little, and D. O’Shea, Ideals, Varieties and Algorithms, Third Edition, Springer,
New York, Berlin, Heidelberg, 2007.


2. M. Kreuzer and L. Robbiano, Computational Commutative Algebra 1, Springer, New 
York, Berlin, Heidelberg, 2000. 


3. L. Weisner, Introduction to the Theory of Equations, Macmillan, New York, 1938. 


CHAPTER 6 


THE GALOIS GROUP 


In this chapter we will define the Galois group of a finite extension F C L. We will 
then study the Galois group of the splitting field of a separable polynomial and give 
some examples of Galois groups. 


6.1 DEFINITION OF THE GALOIS GROUP


If $L$ is a field, then an automorphism of $L$ is a field isomorphism $\sigma : L \to L$. We now
define one of the central objects in Galois theory.


Definition 6.1.1 Let $F \subset L$ be a finite extension. Then $\mathrm{Gal}(L/F)$ is the set

$$\{\sigma : L \to L \mid \sigma \text{ is an automorphism},\ \sigma(a) = a \text{ for all } a \in F\}.$$


In other words, Gal(L/F) consists of all automorphisms of L that are the identity 
on F. The basic structure of Gal(L/F) is as follows. 


Proposition 6.1.2 Gal(L/F) is a group under composition. 


Proof: First suppose that $\sigma, \tau \in \mathrm{Gal}(L/F)$. Then $\sigma\tau$ is the composition $\sigma \circ \tau$, which
is an automorphism because $\sigma, \tau$ are. Also, if $a \in F$, then $\sigma \circ \tau(a) = \sigma(\tau(a)) =
\sigma(a) = a$, since $\sigma, \tau$ are the identity on $F$. Hence composition gives an operation on
$\mathrm{Gal}(L/F)$, which is associative by standard properties of composition.

The identity map $1_L : L \to L$ is an isomorphism that is the identity on $F$, so that
$1_L \in \mathrm{Gal}(L/F)$. One easily checks that $\sigma \circ 1_L = 1_L \circ \sigma = \sigma$ for all $\sigma \in \mathrm{Gal}(L/F)$.
Thus $1_L$ is the identity element of $\mathrm{Gal}(L/F)$.

Finally, any $\sigma \in \mathrm{Gal}(L/F)$ is an automorphism, which means that its inverse
$\sigma^{-1} : L \to L$ is also an automorphism. Also, if $a \in F$, then $a = \sigma(a)$, which implies
$\sigma^{-1}(a) = \sigma^{-1}(\sigma(a)) = a$. This shows that $\sigma^{-1} \in \mathrm{Gal}(L/F)$ and completes the proof
that $\mathrm{Gal}(L/F)$ is a group under composition. ∎


Because of this proposition, we call $\mathrm{Gal}(L/F)$ the Galois group of $F \subset L$. In order
to compute Galois groups, we need to know how elements of $\mathrm{Gal}(L/F)$ behave. We
begin with the following simple observation.


Lemma 6.1.3 Let $F \subset L$ be finite, and fix $\sigma \in \mathrm{Gal}(L/F)$. Given $h \in F[x_1,\dots,x_n]$ and
$\beta_1,\dots,\beta_n \in L$, we have

$$\sigma(h(\beta_1,\dots,\beta_n)) = h(\sigma(\beta_1),\dots,\sigma(\beta_n)).$$

In particular, if $h \in F[x]$ and $\beta \in L$, then

$$\sigma(h(\beta)) = h(\sigma(\beta)).$$


Proof: This follows immediately because $\sigma$ preserves addition and multiplication
and is the identity on the coefficients of $h$. ∎


This lemma has some nice consequences concerning the Galois group. 


Proposition 6.1.4 Let $F \subset L$ be a finite extension and let $\sigma \in \mathrm{Gal}(L/F)$. Then:

(a) If $h \in F[x]$ is a nonconstant polynomial with $\alpha \in L$ as a root, then $\sigma(\alpha)$ is also
a root of $h$ lying in $L$.
(b) If $L = F(\alpha_1,\dots,\alpha_n)$, then $\sigma$ is uniquely determined by its values on $\alpha_1,\dots,\alpha_n$.


Proof: By Lemma 6.1.3, $h \in F[x]$ and $0 = h(\alpha)$ imply that

$$0 = \sigma(0) = \sigma(h(\alpha)) = h(\sigma(\alpha)),$$

which shows that $\sigma(\alpha) \in L$ is also a root of $h$. Part (a) follows.

Turning to part (b), note that $L = F[\alpha_1,\dots,\alpha_n]$, since $L = F(\alpha_1,\dots,\alpha_n)$ is a finite
extension of $F$. Hence any $\beta \in L$ can be written

$$\beta = h(\alpha_1,\dots,\alpha_n)$$

for some polynomial $h \in F[x_1,\dots,x_n]$. By Lemma 6.1.3,

$$\sigma(\beta) = \sigma(h(\alpha_1,\dots,\alpha_n)) = h(\sigma(\alpha_1),\dots,\sigma(\alpha_n)).$$

It follows that $\sigma : L \to L$ is uniquely determined by $\sigma(\alpha_1),\dots,\sigma(\alpha_n)$. ∎


This proposition leads to our first result on the structure of Gal(L/F). 
Corollary 6.1.5 Let $F \subset L$ be a finite extension. Then its Galois group $\mathrm{Gal}(L/F)$ is
finite.


Proof: Since $F \subset L$ is finite, $L = F(\alpha_1,\dots,\alpha_n)$, where each $\alpha_i$ is algebraic over
$F$. Now suppose that $\sigma \in \mathrm{Gal}(L/F)$. By part (b) of Proposition 6.1.4, $\sigma$ is uniquely
determined by $\sigma(\alpha_1),\dots,\sigma(\alpha_n)$. Furthermore, if $p_i \in F[x]$ is the minimal polynomial
of $\alpha_i$, then part (a) shows that there are at most $\deg(p_i)$ possibilities for $\sigma(\alpha_i)$. The
finiteness of $\mathrm{Gal}(L/F)$ follows immediately. ∎


In Exercise 1 you will show that in the situation of Corollary 6.1.5, one has
$|\mathrm{Gal}(L/F)| \le \deg(p_1)\cdots\deg(p_n)$.

Let us now use Proposition 6.1.4 to compute some Galois groups. We begin by 
observing that Galois groups are sometimes unexpectedly small. 


Example 6.1.6 Consider the extension

$$\mathbb{Q} \subset L = \mathbb{Q}(\sqrt[3]{2})$$

studied in Example 5.2.2. The minimal polynomial of $\sqrt[3]{2}$ over $\mathbb{Q}$ is $x^3 - 2$, which has
roots $\sqrt[3]{2}$, $\omega\sqrt[3]{2}$, $\omega^2\sqrt[3]{2}$, where $\omega = e^{2\pi i/3}$. The last two are not real and hence can’t
lie in $L$. Hence every $\sigma \in \mathrm{Gal}(L/\mathbb{Q})$ must satisfy $\sigma(\sqrt[3]{2}) = \sqrt[3]{2}$. Since $\sigma$ is uniquely
determined by $\sigma(\sqrt[3]{2})$, it must be the identity. Thus $\mathrm{Gal}(L/\mathbb{Q}) = \{1_L\}$. Do you see
how this argument uses both parts of Proposition 6.1.4? ◁▷


Example 6.1.7 Let $F = k(t)$, where $k$ is a field of characteristic $p$, and let $F \subset L$ be
the splitting field of $f = x^p - t \in F[x]$. If $\alpha \in L$ is a root of $f$, then $L = F(\alpha)$ and
$f = (x - \alpha)^p$ by Example 5.3.11. Thus $\alpha$ is the only root of $f$. Arguing as in the
previous example, we see that $\mathrm{Gal}(L/F) = \{1_L\}$. ◁▷


Here are some examples where the Galois group is nontrivial. 


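Example 6.1.8 Let $\tau : \mathbb{C} \to \mathbb{C}$ be complex conjugation, i.e., $\tau(z) = \bar{z}$ for $z \in \mathbb{C}$.
By (A.5), we know that $\tau$ is a homomorphism of fields, and it is an automorphism
because $\tau \circ \tau$ is the identity. Furthermore, we have $\tau(a) = a$ for all $a \in \mathbb{R}$, so that
$\tau \in \mathrm{Gal}(\mathbb{C}/\mathbb{R})$. Thus $\mathrm{Gal}(\mathbb{C}/\mathbb{R})$ has at least two elements, since $1_{\mathbb{C}} \in \mathrm{Gal}(\mathbb{C}/\mathbb{R})$.

However, we also know that $\mathbb{C} = \mathbb{R}(i)$. Since the roots of $x^2 + 1$ are $\pm i$, Proposition 6.1.4
implies that $\sigma \in \mathrm{Gal}(\mathbb{C}/\mathbb{R})$ is determined uniquely by $\sigma(i) = \pm i$. Hence
$\mathrm{Gal}(\mathbb{C}/\mathbb{R})$ has at most two elements. Combining this with the previous paragraph,
we conclude that

$$\mathrm{Gal}(\mathbb{C}/\mathbb{R}) = \{1_{\mathbb{C}}, \tau\}.$$

It follows that $\mathrm{Gal}(\mathbb{C}/\mathbb{R}) \simeq \mathbb{Z}/2\mathbb{Z}$. ◁▷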


Example 6.1.9 Next consider the extension $\mathbb{Q} \subset L = \mathbb{Q}(\sqrt{2})$. Arguing as in the
previous example shows that $\sigma \in \mathrm{Gal}(L/\mathbb{Q})$ is determined uniquely by $\sigma(\sqrt{2}) = \pm\sqrt{2}$.
Thus $|\mathrm{Gal}(L/\mathbb{Q})| \le 2$. There are two ways to see that equality occurs:

• By explicit computation, one can show that $\sigma(a + b\sqrt{2}) = a - b\sqrt{2}$ is an automorphism
of $L$.
• $L = \mathbb{Q}(\sqrt{2})$ is the splitting field of $x^2 - 2$ over $\mathbb{Q}$. Since $x^2 - 2$ is irreducible over
$\mathbb{Q}$ and $\pm\sqrt{2} \in L$, Proposition 5.1.8 implies that there is an automorphism of $L$ that
takes $\sqrt{2}$ to $-\sqrt{2}$ and is the identity on $\mathbb{Q}$. ◁▷
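
The explicit computation in the first bullet can be spot-checked by machine. The sketch below is an illustration only: it represents $a + b\sqrt{2}$ as the pair `(a, b)` of rationals, uses helper names of our own (`add`, `mul`, `sigma`), and tests the homomorphism identities on a finite sample, which is evidence rather than a proof.

```python
from fractions import Fraction
from itertools import product

def add(x, y):
    return (x[0] + y[0], x[1] + y[1])

def mul(x, y):
    # (a + b*sqrt2)(c + d*sqrt2) = (ac + 2bd) + (ad + bc)*sqrt2
    (a, b), (c, d) = x, y
    return (a*c + 2*b*d, a*d + b*c)

def sigma(x):
    # the candidate automorphism a + b*sqrt2 -> a - b*sqrt2
    return (x[0], -x[1])

samples = [(Fraction(a), Fraction(b, 3))
           for a, b in product(range(-2, 3), repeat=2)]

respects_add = all(sigma(add(x, y)) == add(sigma(x), sigma(y))
                   for x in samples for y in samples)
respects_mul = all(sigma(mul(x, y)) == mul(sigma(x), sigma(y))
                   for x in samples for y in samples)
fixes_rationals = all(sigma((q, Fraction(0))) == (q, Fraction(0))
                      for q, _ in samples)
is_involution = all(sigma(sigma(x)) == x for x in samples)
```

Because $\sigma \circ \sigma$ is the identity, $\sigma$ is its own inverse, which is the same reason that complex conjugation is an automorphism in Example 6.1.8.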


Our last example will appear often in this chapter. 


Example 6.1.10 For the extension $\mathbb{Q} \subset L = \mathbb{Q}(\sqrt{2},\sqrt{3})$, Proposition 6.1.4 implies
that $\sigma \in \mathrm{Gal}(L/\mathbb{Q})$ is determined uniquely by

(6.1) $\sigma(\sqrt{2}) = \pm\sqrt{2}$, $\sigma(\sqrt{3}) = \pm\sqrt{3}$.

This gives the inequality $|\mathrm{Gal}(L/\mathbb{Q})| \le 4$. The natural question is whether all possible
sign combinations in (6.1) actually occur, i.e., whether $|\mathrm{Gal}(L/\mathbb{Q})| = 4$. In Exercise 2
you will prove this using Proposition 5.1.8 as in the previous example. We will learn
a much quicker method in Section 6.2. ◁▷


Finally, we study what happens when we go to an isomorphic field. 


Proposition 6.1.11 Suppose that $F \subset L_1$ and $F \subset L_2$ are finite extensions, and let
$\varphi : L_1 \to L_2$ be an isomorphism that is the identity on $F$. Then the map sending $\sigma$ to
$\varphi \circ \sigma \circ \varphi^{-1}$ defines a group isomorphism

$$\mathrm{Gal}(L_1/F) \simeq \mathrm{Gal}(L_2/F).$$


Proof: You will prove this in Exercise 3. ∎


Proposition 6.1.11 shows that isomorphic fields give isomorphic Galois groups. 
We use this as follows. 


Definition 6.1.12 Let $f \in F[x]$. The Galois group of $f$ over $F$ is $\mathrm{Gal}(L/F)$, where $L$
is a splitting field of $f$ over $F$.


To check that Definition 6.1.12 makes sense, suppose that $L_1$ and $L_2$ are splitting
fields of $f \in F[x]$. Corollary 5.1.7 implies $L_1 \simeq L_2$ via an isomorphism that is the
identity on $F$, and hence $\mathrm{Gal}(L_1/F) \simeq \mathrm{Gal}(L_2/F)$ by Proposition 6.1.11. Thus the
Galois group of $f$ over $F$ is well defined up to isomorphism.

Using this terminology, Example 6.1.8 tells us that the Galois group of $x^2 + 1$ over
$\mathbb{R}$ is $\mathbb{Z}/2\mathbb{Z}$.


Historical Notes 


The definition of Galois group given here is very different from the one given by 
Galois. He only dealt with splitting fields, and for him, the Galois group consisted 
of certain permutations of the roots. We will give Galois’s definition and explore its 
relation to Definition 6.1.1 in Chapter 12. 

Isomorphisms of fields were first defined by Richard Dedekind in 1877 under the 
name “permutations.” Here is his definition from [1, pp. 108-109]: 
Now let $\Omega$ be any field. By a permutation of $\Omega$ we mean a substitution which
changes each number

$$\alpha,\ \beta,\ \alpha + \beta,\ \alpha - \beta,\ \alpha\beta,\ \alpha/\beta$$

of $\Omega$ into a corresponding number

$$\alpha',\ \beta',\ (\alpha + \beta)',\ (\alpha - \beta)',\ (\alpha\beta)',\ (\alpha/\beta)'$$

in such a way that

$$(\alpha + \beta)' = \alpha' + \beta'$$
$$(\alpha\beta)' = \alpha'\beta'$$

are satisfied and the substitute numbers $\alpha', \beta', \dots$ are not all zero. We shall see
that the set $\Omega'$ of the latter numbers forms a new field, . . .


In Exercise 4 you will show that this implies that the map $\Omega \to \Omega'$ given by $\alpha \mapsto \alpha'$
is an isomorphism of fields.

By 1894 Dedekind was also aware of the relevance of automorphisms to Galois 
theory. Dedekind’s influence can be seen in the work of Heinrich Weber, who gave a 
careful account of group theory and Galois theory in the first volume of his Lehrbuch 
der Algebra, which appeared in 1894. In this book Weber begins with Galois’s 
definition of the Galois group and shows how this leads to automorphisms of the 
splitting field. 

The final step in the evolution of the Galois group is due to Emil Artin, who 
during the 1920s made Definition 6.1.1 the starting point of Galois theory. The 
first exposition of this approach appeared in the 1930 edition of [van der Waerden]. 
Artin published his own account of Galois theory in 1938 and 1942. The latter was 
enormously influential and is still in print as [Artin]. See [2] for more details. 


Exercises for Section 6.1 


Exercise 1. Let $L = F(\alpha_1,\dots,\alpha_n)$, and let $p_i \in F[x]$ be a nonzero polynomial vanishing at $\alpha_i$.
Explain why the proof of Corollary 6.1.5 implies that $|\mathrm{Gal}(L/F)| \le \deg(p_1)\cdots\deg(p_n)$.


Exercise 2. Consider the extension $\mathbb{Q} \subset L = \mathbb{Q}(\sqrt{2},\sqrt{3})$. In Exercise 13 of Section 5.1, you
used Proposition 5.1.8 to construct an automorphism of $L$ that takes $\sqrt{3}$ to $-\sqrt{3}$ and is the
identity on $\mathbb{Q}(\sqrt{2})$. By interchanging the roles of 2 and 3 in this construction, explain why all
possible signs in (6.1) can occur. This shows that $|\mathrm{Gal}(L/\mathbb{Q})| = 4$.


Exercise 3. This exercise will prove a generalized form of Proposition 6.1.11.

(a) Let $\varphi : L_1 \simeq L_2$ be an isomorphism of fields. Given a subfield $F_1 \subset L_1$, set $F_2 = \varphi(F_1)$,
which is a subfield of $L_2$. Prove that the map sending $\sigma \in \mathrm{Gal}(L_1/F_1)$ to $\varphi \circ \sigma \circ \varphi^{-1}$
induces an isomorphism $\mathrm{Gal}(L_1/F_1) \simeq \mathrm{Gal}(L_2/F_2)$.

(b) Explain why Proposition 6.1.11 follows from part (a).


Exercise 4. In the Historical Notes, we saw that Dedekind defined a “permutation” $\alpha \mapsto \alpha'$ to
be a map $\Omega \to \Omega'$ satisfying $(\alpha + \beta)' = \alpha' + \beta'$ and $(\alpha\beta)' = \alpha'\beta'$ for all $\alpha, \beta \in \Omega$. Dedekind
also assumes that $\Omega' = \{\alpha' \mid \alpha \in \Omega\}$ and that the $\alpha'$ are not all zero.

(a) Show that $1 \in \Omega$ maps to $1 \in \Omega'$. Once this is proved, it follows that $\alpha \mapsto \alpha'$ is a ring
homomorphism. (Recall that sending 1 to 1 is part of the definition of ring homomorphism
given in Appendix A.)

(b) Show that the map $\alpha \mapsto \alpha'$ is one-to-one.

This shows that Dedekind’s definition of field isomorphism is equivalent to ours.


Exercise 5. Prove the following inequalities:

(a) $|\mathrm{Gal}(\mathbb{Q}(\sqrt{2},\sqrt{3},\sqrt{5})/\mathbb{Q})| \le 8$.

(b) $|\mathrm{Gal}(\mathbb{Q}(\sqrt{p_1},\dots,\sqrt{p_n})/\mathbb{Q})| \le 2^n$, where $p_1,\dots,p_n$ are the first $n$ primes.

In each case, one can show that these are actually equalities.


Exercise 6. If we apply Exercise 1 to the extension $\mathbb{Q} \subset L = \mathbb{Q}(\sqrt{6},\sqrt{10},\sqrt{15})$, we get the
inequality $|\mathrm{Gal}(L/\mathbb{Q})| \le 8$. Show that $|\mathrm{Gal}(L/\mathbb{Q})| \le 4$.


Exercise 7. Let $F \subset L$ be a finite extension, and let $\sigma : L \to L$ be a ring homomorphism that
is the identity on $F$. This exercise will show that $\sigma$ is an automorphism.

(a) Show that $\sigma$ is one-to-one.

(b) Show that $\sigma$ is onto.


6.2 GALOIS GROUPS OF SPLITTING FIELDS 


In this section we will study the Galois group of the splitting field of a separable
polynomial. Recall from Section 5.3 that $f \in F[x]$ is separable if it has distinct roots
in a splitting field. This is the situation considered by Galois.

We now prove the first main theorem of Galois theory.


Theorem 6.2.1 If $L$ is the splitting field of a separable polynomial in $F[x]$, then the
Galois group of $F \subset L$ has order $|\mathrm{Gal}(L/F)| = [L:F]$.


Proof: Our hypothesis implies that $L = F(\alpha_1,\dots,\alpha_n)$, where $\alpha_1,\dots,\alpha_n$ are the
roots of a separable polynomial $f \in F[x]$. Then each $\alpha_i$ is separable over $F$ (be sure
you can explain why). By the Theorem of the Primitive Element (Theorem 5.4.1),
we can find $\beta \in L$ separable over $F$ such that $L = F(\beta)$. Let $h \in F[x]$ be the minimal
polynomial of $\beta$. Note that $h$ is separable, since $\beta$ is.

Since $L = F(\beta)$, Proposition 4.3.4 implies that $[L:F] = m$, where $m = \deg(h)$. To
prove the theorem, we need to show that $\mathrm{Gal}(L/F)$ has $m$ elements. We will use the
following ideas from Chapter 5:

• Normality (Section 5.2): If an irreducible polynomial has one root in a splitting
field, then all of its roots lie in the splitting field.

• Separability (Section 5.3): Separability means that a polynomial has distinct roots
in its splitting field.

• Isomorphisms (Proposition 5.1.8): If two elements in a splitting field $L$ are roots
of the same irreducible polynomial over $F$, then there is an automorphism of $L$
that is the identity on $F$ and takes one root to the other.

As we will now explain, the theorem follows easily from these ideas.

The above polynomial $h \in F[x]$ is separable and has a root $\beta \in L$. Since $L$ is
a splitting field over $F$, the bullets for normality and separability imply that $h$ has
distinct roots $\beta = \beta_1, \beta_2, \dots, \beta_m$, $m = \deg(h)$, all of which lie in $L$. Now fix one of
the roots, say $\beta_i$. Then $\beta$ and $\beta_i$ are roots of the irreducible polynomial $h$. Since
$L$ is a splitting field over $F$, the bullet for isomorphisms implies that there is an
automorphism $\sigma_i$ of $L$ such that $\sigma_i(\beta) = \beta_i$ and $\sigma_i$ is the identity on $F$.

It follows that $\sigma_1,\dots,\sigma_m \in \mathrm{Gal}(L/F)$. Note that $\sigma_i \ne \sigma_j$ for $i \ne j$, since $\sigma_i(\beta) =
\beta_i \ne \beta_j = \sigma_j(\beta)$. Thus $\mathrm{Gal}(L/F)$ has at least $m$ distinct elements. But given any
$\sigma \in \mathrm{Gal}(L/F)$, Proposition 6.1.4 and $L = F(\beta)$ imply that $\sigma$ is uniquely determined
by $\sigma(\beta) \in \{\beta_1,\dots,\beta_m\}$. It follows that $\sigma = \sigma_i$ for some $i$. This completes the proof
of the theorem. ∎


The following example illustrates the power of the theorem just proved. 


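Example 6.2.2 Consider $\mathbb{Q} \subset L = \mathbb{Q}(\sqrt{2},\sqrt{3})$. In Example 6.1.10, we saw that
$\sigma \in \mathrm{Gal}(L/\mathbb{Q})$ is uniquely determined by

$$\sigma(\sqrt{2}) = \pm\sqrt{2}, \quad \sigma(\sqrt{3}) = \pm\sqrt{3},$$

which implies that $|\mathrm{Gal}(L/\mathbb{Q})| \le 4$. We also asked whether equality occurs.
This is now easy to decide, for $[L:\mathbb{Q}] = 4$ by Example 4.3.9 and $L$ is the splitting
field of the separable polynomial $(x^2 - 2)(x^2 - 3)$. Hence all of the above sign
combinations must occur. In particular, we can find $\sigma, \tau \in \mathrm{Gal}(L/\mathbb{Q})$ such that

(6.2) $\sigma(\sqrt{2}) = \sqrt{2}$, $\sigma(\sqrt{3}) = -\sqrt{3}$ and $\tau(\sqrt{2}) = -\sqrt{2}$, $\tau(\sqrt{3}) = \sqrt{3}$.

In Exercise 1 you will show that $\mathrm{Gal}(L/\mathbb{Q}) = \{1_L, \sigma, \tau, \sigma\tau\} \simeq \mathbb{Z}/2\mathbb{Z} \times \mathbb{Z}/2\mathbb{Z}$ (this is
usually called the Klein four-group). ◁▷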


It is important to understand why the hypotheses splitting field and separable are
necessary in the proof of Theorem 6.2.1. We can see this in the first two examples
considered in Section 6.1:

• Consider $\mathbb{Q} \subset \mathbb{Q}(\sqrt[3]{2})$. The Galois group is trivial by Example 6.1.6. This
extension is not a splitting field, by Example 5.2.2.

• Consider $F = k(t) \subset L$, where $k$ has characteristic $p$ and $L$ is the splitting field of
$f = x^p - t$. The Galois group is trivial by Example 6.1.7. This polynomial is not
separable by Example 5.3.11.

In both of these examples, note that $|\mathrm{Gal}(L/F)| < [L:F]$. This is no accident, for in
Section 7.1 we will prove that $|\mathrm{Gal}(L/F)| \le [L:F]$, with equality if and only if $L$ is
the splitting field of a separable polynomial in $F[x]$. Such extensions will be called
Galois extensions in Chapter 7.


Exercises for Section 6.2 


Exercise 1. Complete Example 6.2.2 by showing that $\mathrm{Gal}(L/\mathbb{Q}) = \{1_L, \sigma, \tau, \sigma\tau\}$ and that
$\mathrm{Gal}(L/\mathbb{Q}) \simeq \mathbb{Z}/2\mathbb{Z} \times \mathbb{Z}/2\mathbb{Z}$.


Exercise 2. Consider $\mathbb{Q} \subset L = \mathbb{Q}(\omega, \sqrt[3]{2})$, where $\omega = e^{2\pi i/3}$.
(a) Explain why $\sigma \in \mathrm{Gal}(L/\mathbb{Q})$ is uniquely determined by $\sigma(\omega) \in \{\omega, \omega^2\}$ and
$\sigma(\sqrt[3]{2}) \in \{\sqrt[3]{2}, \omega\sqrt[3]{2}, \omega^2\sqrt[3]{2}\}$.
(b) Explain why all possible combinations for $\sigma(\omega)$ and $\sigma(\sqrt[3]{2})$ actually occur.
In the next section we will show that $\mathrm{Gal}(L/\mathbb{Q}) \simeq S_3$.


Exercise 3. Consider $\mathbb{Q} \subset L = \mathbb{Q}(\zeta_5, \sqrt[5]{2})$, where $\zeta_5 = e^{2\pi i/5}$. By Proposition 4.2.5, the
minimal polynomial of $\zeta_5$ over $\mathbb{Q}$ is $x^4 + x^3 + x^2 + x + 1$.
(a) Show that $[L:\mathbb{Q}] = 20$.
(b) Show that $L$ is the splitting field of $x^5 - 2$ over $\mathbb{Q}$, and conclude that $\mathrm{Gal}(L/\mathbb{Q})$ is a group
of order 20.
We will describe the structure of this Galois group in Section 6.4.


Exercise 4. Consider the $n$th root of unity $\zeta_n = e^{2\pi i/n}$. We call $\mathbb{Q} \subset \mathbb{Q}(\zeta_n)$ a cyclotomic
extension of $\mathbb{Q}$.

(a) Show that $\mathbb{Q} \subset \mathbb{Q}(\zeta_n)$ is a splitting field of a separable polynomial.

(b) Given $\sigma \in \mathrm{Gal}(\mathbb{Q}(\zeta_n)/\mathbb{Q})$, show that $\sigma(\zeta_n) = \zeta_n^i$ for some integer $i$.

(c) Show that the integer $i$ in part (b) is relatively prime to $n$.

(d) The set of congruence classes modulo $n$ relatively prime to $n$ forms a group under
multiplication, denoted $(\mathbb{Z}/n\mathbb{Z})^*$. Show that the map $\sigma \mapsto [i]$, where $\sigma(\zeta_n) = \zeta_n^i$, defines a
one-to-one group homomorphism $\mathrm{Gal}(\mathbb{Q}(\zeta_n)/\mathbb{Q}) \to (\mathbb{Z}/n\mathbb{Z})^*$.

(e) The order of $(\mathbb{Z}/n\mathbb{Z})^*$ is $|(\mathbb{Z}/n\mathbb{Z})^*| = \phi(n)$, where $\phi(n)$ is the Euler $\phi$-function from
number theory. Prove that the homomorphism of part (d) is an isomorphism if and only
if $[\mathbb{Q}(\zeta_n):\mathbb{Q}] = \phi(n)$.

(f) Let $p$ be prime. Use part (e) and Proposition 4.2.5 to show that $\mathrm{Gal}(\mathbb{Q}(\zeta_p)/\mathbb{Q}) \simeq (\mathbb{Z}/p\mathbb{Z})^*$.

In Chapter 9 we will prove that $[\mathbb{Q}(\zeta_n):\mathbb{Q}] = \phi(n)$. By part (e), this will imply that there is an
isomorphism $\mathrm{Gal}(\mathbb{Q}(\zeta_n)/\mathbb{Q}) \simeq (\mathbb{Z}/n\mathbb{Z})^*$ for all $n$.


Exercise 5. Let $F$ have characteristic $p$, and assume that $f = x^p - x + a \in F[x]$ is irreducible
over $F$. Then let $L = F(\alpha)$, where $\alpha$ is a root of $f$ in some splitting field. In Exercise 15 of
Section 5.3, you showed that $F \subset L$ is a normal separable extension.
(a) Show that $|\mathrm{Gal}(L/F)| = p$, and use this to prove that $\mathrm{Gal}(L/F) \simeq \mathbb{Z}/p\mathbb{Z}$.
(b) Exercise 15 of Section 5.3 showed that $\alpha + 1$ is a root of $f$. For $i = 0,\dots,p-1$, show
that there is a unique element of $\mathrm{Gal}(L/F)$ that takes $\alpha$ to $\alpha + i$.
(c) Use part (b) to describe an explicit isomorphism $\mathrm{Gal}(L/F) \simeq \mathbb{Z}/p\mathbb{Z}$.


Exercise 6. Let $f \in F[x]$ be irreducible and separable of degree $n$, and let $F \subset L$ be a splitting
field of $f$. Prove that $n$ divides $|\mathrm{Gal}(L/F)|$.


6.3 PERMUTATIONS OF THE ROOTS 


In Chapter 1 we saw that permutations of the roots of a cubic arise naturally from
Cardan’s formulas. In fact, the title of Section 1.2 was “Permutations of the Roots.”
We now explain more generally how Galois groups relate to permutations. As in the
previous section, we assume that $L$ is the splitting field of a separable polynomial
$f \in F[x]$. Our goal is to interpret $\mathrm{Gal}(L/F)$ in terms of permutations of roots of $f$.

Let $n = \deg(f)$. Then in $L[x]$ we can write $f$ as the product

$$f = a_0(x - \alpha_1)\cdots(x - \alpha_n),$$

where $a_0 \ne 0$ and $\alpha_1,\dots,\alpha_n \in L$ are distinct. In this situation we get a map

(6.3) $\mathrm{Gal}(L/F) \to S_n$

as follows. Given $\sigma \in \mathrm{Gal}(L/F)$, Proposition 6.1.4 implies that $\sigma(\alpha_i)$ is a root of $f$
(since $\alpha_i$ is), so that $\sigma(\alpha_i) = \alpha_{\tau(i)}$ for some $\tau(i) \in \{1,\dots,n\}$. Note that $\tau(i)$ is
uniquely determined, since $\alpha_1,\dots,\alpha_n$ are distinct. Also,

$$\tau : \{1,\dots,n\} \to \{1,\dots,n\}$$

is one-to-one since $\sigma$ is (be sure you see why). It follows that $\tau$ is a permutation, i.e.,
$\tau \in S_n$. This defines the map (6.3).


Proposition 6.3.1 The map $\mathrm{Gal}(L/F) \to S_n$ described in (6.3) is a one-to-one group
homomorphism.


Proof: Suppose that $\sigma_1, \sigma_2 \in \mathrm{Gal}(L/F)$ correspond to $\tau_1, \tau_2 \in S_n$ via (6.3). This
means that $\sigma_1(\alpha_i) = \alpha_{\tau_1(i)}$, and similarly for $\sigma_2$ and $\tau_2$. Then

$$\sigma_1 \circ \sigma_2(\alpha_i) = \sigma_1(\sigma_2(\alpha_i)) = \sigma_1(\alpha_{\tau_2(i)}) = \alpha_{\tau_1(\tau_2(i))} = \alpha_{\tau_1\tau_2(i)}.$$

This shows that $\sigma_1 \circ \sigma_2$ corresponds to $\tau_1\tau_2$, so that (6.3) is a group homomorphism.
It remains to show that (6.3) is one-to-one. This follows immediately from
Proposition 6.1.4, since $L = F(\alpha_1,\dots,\alpha_n)$. The proof is now complete. ∎


Proposition 6.3.1 shows that for the splitting field of a separable polynomial of
degree $n$, we can regard the Galois group as a subgroup of $S_n$. By Lagrange’s Theorem,
it follows that $|\mathrm{Gal}(L/F)|$ divides $n!$. Combining this with $[L:F] = |\mathrm{Gal}(L/F)|$ from
Theorem 6.2.1, we get the following corollary.


Corollary 6.3.2 If $L$ is the splitting field of a separable polynomial $f \in F[x]$, then
$[L:F]$ divides $n!$, where $n = \deg(f)$. ∎


Theorem 5.1.5 states that $[L:F] \le n!$ when $L$ is the splitting field of $f \in F[x]$ of
degree $n$. Do you see how Corollary 6.3.2 refines this result when $f$ is separable?
Here are some examples of Proposition 6.3.1.


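Example 6.3.3 We know that the splitting field of $f = (x^2 - 2)(x^2 - 3)$ over $\mathbb{Q}$ is
$L = \mathbb{Q}(\sqrt{2},\sqrt{3})$. Example 6.2.2 shows that $\mathrm{Gal}(L/\mathbb{Q}) = \{1_L, \sigma, \tau, \sigma\tau\}$, where $\sigma$ and
$\tau$ satisfy

$$\sigma(\sqrt{2}) = \sqrt{2},\ \sigma(\sqrt{3}) = -\sqrt{3} \quad\text{and}\quad \tau(\sqrt{2}) = -\sqrt{2},\ \tau(\sqrt{3}) = \sqrt{3}.$$

Let $\alpha_1 = \sqrt{2}$, $\alpha_2 = -\sqrt{2}$, $\alpha_3 = \sqrt{3}$, and $\alpha_4 = -\sqrt{3}$. Then $\mathrm{Gal}(L/\mathbb{Q})$ is isomorphic
to a subgroup of $S_4$ by Proposition 6.3.1. The automorphism $\sigma$ fixes $\alpha_1, \alpha_2$ and
interchanges $\alpha_3, \alpha_4$. Hence $\sigma \mapsto (34) \in S_4$. One similarly shows that $\tau \mapsto (12)$, so
$\sigma\tau \mapsto (34)(12) = (12)(34)$. Thus $\mathrm{Gal}(L/\mathbb{Q}) \simeq \{e, (12), (34), (12)(34)\} \subset S_4$. ◁▷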
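
The passage from automorphisms to permutations in this example is mechanical enough to script. In the sketch below, which is only an illustration, an automorphism is encoded by its pair of signs on $\sqrt{2}$ and $\sqrt{3}$, the roots are listed in the order $\alpha_1,\dots,\alpha_4$ used above, and a permutation $\tau$ is written as the tuple $(\tau(1),\dots,\tau(4))$. The encoding and the name `permutation` are assumptions of ours.

```python
from itertools import product

# alpha_1, ..., alpha_4 encoded as (which square root, sign)
ROOTS = [("sqrt2", 1), ("sqrt2", -1), ("sqrt3", 1), ("sqrt3", -1)]

def permutation(signs):
    """signs = (e2, e3): the automorphism with sqrt2 -> e2*sqrt2, sqrt3 -> e3*sqrt3.
    Returns the tuple (tau(1), ..., tau(4)) describing its action on the roots."""
    e = {"sqrt2": signs[0], "sqrt3": signs[1]}
    images = [(name, s * e[name]) for name, s in ROOTS]
    return tuple(ROOTS.index(img) + 1 for img in images)

sigma = permutation((1, -1))       # fixes sqrt2, negates sqrt3
tau = permutation((-1, 1))         # negates sqrt2, fixes sqrt3
sigma_tau = permutation((-1, -1))  # negates both
image = {permutation(s) for s in product((1, -1), repeat=2)}
```

In one-line notation, `(1, 2, 4, 3)` is the transposition $(34)$ and `(2, 1, 3, 4)` is $(12)$, so the four tuples in `image` are exactly the subgroup found in the example.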
Example 6.3.4 Consider the extension $\mathbb{Q} \subset L = \mathbb{Q}(\omega, \sqrt[3]{2})$, $\omega = e^{2\pi i/3}$. Since $L$ is
the splitting field of $x^3 - 2$ over $\mathbb{Q}$ (Exercise 1 of Section 5.1), we get a one-to-one
group homomorphism $\mathrm{Gal}(L/\mathbb{Q}) \to S_3$. However, we learned in Example 4.3.10 that
$[L:\mathbb{Q}] = 6$. Since $[L:\mathbb{Q}] = |\mathrm{Gal}(L/\mathbb{Q})|$, it follows that $\mathrm{Gal}(L/\mathbb{Q}) \simeq S_3$. You will
work out the details of this isomorphism in Exercise 1. ◁▷


When one thinks of Galois groups in terms of permutations, it makes sense to
ask how properties of the permutations relate to properties of the corresponding field
extension. One nice example of this involves the following subgroups of $S_n$.


Definition 6.3.5 A subgroup $H \subset S_n$ is transitive if for every pair of elements $i, j \in
\{1,\dots,n\}$, there is $\tau \in H$ such that $\tau(i) = j$.


For example, $S_n$ is a transitive subgroup of itself, since the transposition $(i\,j)$ takes
$i$ to $j$. But not all subgroups of $S_n$ are transitive.


Example 6.3.6 The subgroup $\{e, (12), (34), (12)(34)\} \subset S_4$ from Example 6.3.3 is
not transitive, since no element of the subgroup takes 1 to 3. ◁▷
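
Definition 6.3.5 translates directly into a finite check. The sketch below is an illustration with names of our own: a permutation $\tau$ of $\{1,\dots,n\}$ is stored as the tuple $(\tau(1),\dots,\tau(n))$, and `is_transitive` tests the defining condition on whatever list it is given, so it is the caller's responsibility to pass an actual subgroup.

```python
from itertools import permutations

def is_transitive(perms, n):
    """True if for every pair i, j in {1..n} some permutation sends i to j.
    Each permutation is a tuple (tau(1), ..., tau(n))."""
    perms = list(perms)
    return all(any(p[i - 1] == j for p in perms)
               for i in range(1, n + 1) for j in range(1, n + 1))

# {e, (12), (34), (12)(34)} from Example 6.3.3
klein_from_example = [(1, 2, 3, 4), (2, 1, 3, 4), (1, 2, 4, 3), (2, 1, 4, 3)]
# {e, (12)(34), (13)(24), (14)(23)}, another subgroup of S_4 of order 4
klein_normal = [(1, 2, 3, 4), (2, 1, 4, 3), (3, 4, 1, 2), (4, 3, 2, 1)]
s4 = list(permutations(range(1, 5)))

results = (is_transitive(klein_from_example, 4),
           is_transitive(klein_normal, 4),
           is_transitive(s4, 4))
```

The two subgroups of order 4 are isomorphic as abstract groups, yet only the second is transitive. Transitivity is therefore a property of how the group sits inside $S_4$, not of its isomorphism class.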


It is natural to ask if the subgroup of $S_n$ corresponding to $\mathrm{Gal}(L/F)$ is transitive.
This question was answered by Camille Jordan in 1870 as follows.


Proposition 6.3.7 Let $L$ be the splitting field of a separable polynomial $f \in F[x]$ of
degree $n$. Then the subgroup of $S_n$ corresponding to $\mathrm{Gal}(L/F)$ is transitive if and
only if $f$ is irreducible over $F$.


Proof: First suppose that $f$ is irreducible with distinct roots $\alpha_1,\dots,\alpha_n \in L$. As in the
proof of Theorem 6.2.1, we can use Proposition 5.1.8 to construct an automorphism
$\sigma : L \to L$ that takes $\alpha_i$ to $\alpha_j$ and is the identity on $F$. Then $\sigma \in \mathrm{Gal}(L/F)$, and the
corresponding permutation in $S_n$ clearly takes $i$ to $j$. Thus $\mathrm{Gal}(L/F)$ gives a transitive
subgroup of $S_n$.

Conversely, suppose that $\mathrm{Gal}(L/F)$ corresponds to a transitive subgroup of $S_n$,
and let $h$ be an irreducible factor of $f$. We will show that $\deg(h) \ge n$, which easily
implies that $f$ is irreducible (do you see why?).

For this purpose, let the roots of $f$ be $\alpha_1,\dots,\alpha_n \in L$. Since $h$ is a nonconstant
factor of $f$, we can find $i$ such that $h(\alpha_i) = 0$. Now pick any $j \in \{1,\dots,n\}$. By our
transitivity assumption, there is $\sigma \in \mathrm{Gal}(L/F)$ such that $\sigma(\alpha_i) = \alpha_j$. Since $h$ has
coefficients in $F$, part (a) of Proposition 6.1.4 implies that $\sigma(\alpha_i) = \alpha_j$ is also a root
of $h$. Since $j$ was arbitrary and $\alpha_1,\dots,\alpha_n$ are distinct, it follows that $h$ has at least $n$
roots, which implies that $\deg(h) \ge n$. ∎


Mathematical Notes 
Here are two topics for further discussion. 


• The Galois Group of a Polynomial. In Section 6.1, the Galois group of $f \in F[x]$
was defined to be $\mathrm{Gal}(L/F)$, where $F \subset L$ is a splitting field of $f$ over $F$. But when $f$
is separable of degree $n$, the Galois group of $f$ has extra structure given by its action
on the roots of $f$. Hence one can argue that the correct definition of “the Galois group
of $f$” is the homomorphism $\mathrm{Gal}(L/F) \to S_n$ studied in this section.


• Transitive Group Actions. Definition 6.3.5 defines a transitive subgroup of $S_n$.
This can be generalized to any group action: if a group $G$ acts on a set $X$ (as defined
in Section A.4), then the action is transitive if for all $x, y \in X$, there is $g \in G$ such that
$g \cdot x = y$. For example, if $L$ is the splitting field of a separable polynomial $f \in F[x]$,
then $\mathrm{Gal}(L/F)$ acts on the roots of $f$. Hence Proposition 6.3.7 can be restated as
saying that $f$ is irreducible if and only if $\mathrm{Gal}(L/F)$ acts transitively on the roots of $f$.


Historical Notes 


Proposition 6.3.1 shows that for the splitting field of a separable polynomial of
degree $n$, the Galois group $\mathrm{Gal}(L/F)$ is isomorphic to a subgroup of the symmetric
group $S_n$. The permutations in this subgroup correspond to those permutations that
respect the algebraic structure of the roots, i.e., those that come from automorphisms
of the splitting field.

Galois defined his “group” to consist of certain arrangements of the roots of the
given polynomial. In Chapter 12 we will show that this set of permutations in $S_n$ is the
image of $\mathrm{Gal}(L/F) \to S_n$. Hence his group agrees with ours up to isomorphism. What
is interesting is that Galois had no notion of automorphism, although automorphisms
are implicit in his development of the theory.

In [Galois, p. 79] Galois defines transitive subgroups of $S_n$ and gives an example of
a nontransitive subgroup that in modern terms is written $\langle (12), (345) \rangle \subset S_5$. However,
his terminology is different: He writes “irreducible” instead of “transitive.” This
shows that he also knew Proposition 6.3.7, though Jordan was the first to state the
result explicitly.


Exercises for Section 6.3 


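Exercise 1. Consider $\mathrm{Gal}(L/\mathbb{Q})$, where $L = \mathbb{Q}(\omega, \sqrt[3]{2})$, $\omega = e^{2\pi i/3}$. By Exercise 2 of
Section 6.2, there are $\sigma, \tau \in \mathrm{Gal}(L/\mathbb{Q})$ such that

$$\sigma(\sqrt[3]{2}) = \omega\sqrt[3]{2}, \quad \sigma(\omega) = \omega \quad \text{and} \quad \tau(\sqrt[3]{2}) = \sqrt[3]{2}, \quad \tau(\omega) = \omega^2.$$

Find the permutations in $S_3$ corresponding to $\sigma$ and $\tau$.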


Exercise 2. For each of the following Galois groups, find an explicit subgroup of $S_4$ that is
isomorphic to the group. Also, the Galois group is isomorphic to which known group? (By
“known groups,” we mean cyclic groups, dihedral groups, the quaternion group, symmetric
groups, alternating groups, products of these groups, etc. You may need to look up some of
these in your abstract algebra text.)

(a) $\mathrm{Gal}(\mathbb{Q}(i, \sqrt{2})/\mathbb{Q})$.

(b) $\mathrm{Gal}(\mathbb{Q}(i, \sqrt[4]{2})/\mathbb{Q})$.


Exercise 3. In the terminology of Exercise 2, $\mathrm{Gal}(\mathbb{Q}(i, \sqrt{2}, \sqrt{3})/\mathbb{Q})$ is isomorphic to which
known group? Explain your reasoning in detail.

Exercise 4. Consider the extension $\mathbb{Q} \subset L = \mathbb{Q}(\alpha)$, where $\alpha = \sqrt{2 + \sqrt{2}}$. In Exercise 6 of
Section 5.1, you showed that $f = x^4 - 4x^2 + 2$ is the minimal polynomial of $\alpha$ over $\mathbb{Q}$ and that
$L$ is the splitting field of $f$ over $\mathbb{Q}$. Show that $\mathrm{Gal}(L/\mathbb{Q}) \simeq \mathbb{Z}/4\mathbb{Z}$.


Exercise 5. Let $f \in F[x]$ be separable, where $f = g_1 \cdots g_s$ for $g_i \in F[x]$ of degree $d_i > 0$, and
let $L$ be the splitting field of $f$ over $F$. Show that $\mathrm{Gal}(L/F)$ is isomorphic to a subgroup of the
product group $S_{d_1} \times \cdots \times S_{d_s}$.


Exercise 6. Let $H$ be a transitive subgroup of $S_n$. Prove that $|H|$ is a multiple of $n$.


Exercise 7. Let $f \in F[x]$ be irreducible and separable of degree $n$ and let $F \subset L$ be a splitting
field of $f$. Use Exercise 6 and Proposition 6.3.7 to prove that $n$ divides $|\mathrm{Gal}(L/F)|$. This gives
an alternate proof of Exercise 6 of Section 6.2.


6.4 EXAMPLES OF GALOIS GROUPS 


In this section we will give some interesting examples of Galois groups. 


A. The pth Roots of 2. Let $\zeta_p = e^{2\pi i/p}$ be a $p$th root of unity, where $p$ is prime.
By Section A.2 the roots of $x^p - 2$ are $\zeta_p^j \sqrt[p]{2}$ for $0 \le j \le p-1$, so that

$$L = \mathbb{Q}(\sqrt[p]{2}, \zeta_p\sqrt[p]{2}, \zeta_p^2\sqrt[p]{2}, \dots, \zeta_p^{p-1}\sqrt[p]{2}) = \mathbb{Q}(\zeta_p, \sqrt[p]{2})$$

is the splitting field of $x^p - 2$ over $\mathbb{Q}$. Our goal is to describe $\mathrm{Gal}(L/\mathbb{Q})$.

The minimal polynomial of $\zeta_p$ over $\mathbb{Q}$ is $x^{p-1} + \cdots + x + 1$ by Proposition 4.2.5 and
the roots of this polynomial are $\zeta_p^i$ for $1 \le i \le p-1$ by Section A.2. Furthermore, the
minimal polynomial of $\sqrt[p]{2}$ over $\mathbb{Q}$ is $x^p - 2$ by the Schönemann-Eisenstein criterion,
and its roots are listed above. Since $p$ and $p-1$ are relatively prime, the method used
in Exercise 5 of Section 4.3 implies that $[L:\mathbb{Q}] = p(p-1)$. (See also Exercise 8 of
Section 5.1.)

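It follows from Theorem 6.2.1 that $\mathrm{Gal}(L/\mathbb{Q})$ is a group of order $p(p-1)$. To
see what group this is, let $\sigma \in \mathrm{Gal}(L/\mathbb{Q})$. Then Proposition 6.1.4 implies that $\sigma$ is
uniquely determined by

$$\sigma(\zeta_p) \in \{\zeta_p, \zeta_p^2, \dots, \zeta_p^{p-1}\}, \quad \sigma(\sqrt[p]{2}) \in \{\sqrt[p]{2}, \zeta_p\sqrt[p]{2}, \zeta_p^2\sqrt[p]{2}, \dots, \zeta_p^{p-1}\sqrt[p]{2}\}.$$

In other words, there are integers $1 \le i \le p-1$ and $0 \le j \le p-1$ such that

$$\sigma(\zeta_p) = \zeta_p^i, \quad \sigma(\sqrt[p]{2}) = \zeta_p^j\sqrt[p]{2}. \tag{6.4}$$

We will denote this $\sigma$ by $\sigma_{i,j}$. The number of possible pairs $(i,j)$ is $(p-1) \cdot p = p(p-1)$.
Since this is also the order of $\mathrm{Gal}(L/\mathbb{Q})$, it follows that all possible pairs

$$(i,j) \in \{1,\dots,p-1\} \times \{0,\dots,p-1\} \tag{6.5}$$

must occur in (6.4).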




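To determine the group structure, we need to compute the composition of $\sigma_{i,j}$ and
$\sigma_{r,s}$. This is done as follows:

$$\begin{aligned}
\sigma_{i,j} \circ \sigma_{r,s}(\zeta_p) &= \sigma_{i,j}(\zeta_p^r) = (\sigma_{i,j}(\zeta_p))^r = (\zeta_p^i)^r = \zeta_p^{ir}, \\
\sigma_{i,j} \circ \sigma_{r,s}(\sqrt[p]{2}) &= \sigma_{i,j}(\zeta_p^s\sqrt[p]{2}) = (\sigma_{i,j}(\zeta_p))^s\,\sigma_{i,j}(\sqrt[p]{2}) = (\zeta_p^i)^s(\zeta_p^j\sqrt[p]{2}) = \zeta_p^{is+j}\sqrt[p]{2}.
\end{aligned}$$

This computation suggests that

$$\sigma_{i,j} \circ \sigma_{r,s} = \sigma_{ir,\,is+j}.$$

Unfortunately, the pair $(ir, is+j)$ need not lie in (6.5). We can resolve this difficulty
by realizing that for $i \in \mathbb{Z}$, $\zeta_p^i$ depends only on the congruence class of $i$ modulo $p$.
In other words, for $a = [i] \in \mathbb{F}_p = \mathbb{Z}/p\mathbb{Z}$, the number $\zeta_p^a = \zeta_p^i$ is well defined.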

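If we set $\mathbb{F}_p^* = \mathbb{F}_p \setminus \{0\}$, then for

$$(a,b) \in \mathbb{F}_p^* \times \mathbb{F}_p,$$

we can define $\sigma_{a,b}$ to be the element of $\mathrm{Gal}(L/\mathbb{Q})$ such that

$$\sigma_{a,b}(\zeta_p) = \zeta_p^a, \quad \sigma_{a,b}(\sqrt[p]{2}) = \zeta_p^b\sqrt[p]{2}.$$

Then the above computation shows that $\sigma_{a,b} \circ \sigma_{c,d} = \sigma_{ac,\,ad+b}$.

This composition formula leads to a geometric description of the Galois group
$\mathrm{Gal}(L/\mathbb{Q})$. Given $a, b \in \mathbb{F}_p$, the function $\gamma_{a,b} : \mathbb{F}_p \to \mathbb{F}_p$ defined by $\gamma_{a,b}(u) = au + b$
is an affine linear transformation. By Exercise 1, $\gamma_{a,b}$ is one-to-one and onto if and
only if $a \ne 0$, and all such $\gamma_{a,b}$ form a group of order $p(p-1)$ under composition.
This group is called $\mathrm{AGL}(1,\mathbb{F}_p)$, the one-dimensional affine linear group modulo $p$.
To understand its structure, we take $u \in \mathbb{F}_p$ and compute

$$\begin{aligned}
\gamma_{a,b} \circ \gamma_{c,d}(u) &= \gamma_{a,b}(\gamma_{c,d}(u)) = \gamma_{a,b}(cu + d) \\
&= a(cu + d) + b = acu + (ad + b) = \gamma_{ac,\,ad+b}(u).
\end{aligned}$$

Thus $\gamma_{a,b} \circ \gamma_{c,d} = \gamma_{ac,\,ad+b}$, so that the map $\sigma_{a,b} \mapsto \gamma_{a,b}$ gives an isomorphism
$\mathrm{Gal}(L/\mathbb{Q}) \simeq \mathrm{AGL}(1,\mathbb{F}_p)$.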
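The composition law for affine maps can also be confirmed by brute force for a small prime. The sketch below is only an illustration under our own conventions: each map $u \mapsto au + b$ on $\mathbb{F}_p$ is stored as the tuple of its values, and the function names `gamma`, `agl`, and `check_composition_law` are hypothetical. It verifies that the maps with $a \ne 0$ are pairwise distinct, so there are $p(p-1)$ of them, and that composing the maps indexed by $(a,b)$ and $(c,d)$ gives the map indexed by $(ac, ad+b)$.

```python
from itertools import product


def gamma(a, b, p):
    """The affine map u -> a*u + b on F_p, stored as a tuple of values."""
    return tuple((a * u + b) % p for u in range(p))


def compose(f, g):
    """Return f∘g for maps stored as tuples of values."""
    return tuple(f[g[u]] for u in range(len(g)))


def agl(p):
    """All affine maps with a != 0, indexed by the pair (a, b)."""
    return {(a, b): gamma(a, b, p) for a in range(1, p) for b in range(p)}


def check_composition_law(p):
    """Check gamma_{a,b} ∘ gamma_{c,d} == gamma_{ac, ad+b} for all pairs."""
    G = agl(p)
    return all(
        compose(G[(a, b)], G[(c, d)]) == G[((a * c) % p, (a * d + b) % p)]
        for (a, b), (c, d) in product(G, repeat=2)
    )


p = 5
print(len(set(agl(p).values())), check_composition_law(p))
```

Because every pair $(a,b)$ with $a \ne 0$ yields a different map, the index pair can stand in for the map itself, which is how the text passes between $\sigma_{a,b}$ and $\gamma_{a,b}$.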
Another way to understand the structure of $\mathrm{AGL}(1,\mathbb{F}_p)$ is via the subgroup

$$T = \{\gamma_{1,b} \mid b \in \mathbb{F}_p\}.$$

In Exercise 2 you will show that there is a group isomorphism $T \simeq \mathbb{F}_p$. You will also
prove that $T$ is a normal subgroup of $\mathrm{AGL}(1,\mathbb{F}_p)$ with quotient

$$\mathrm{AGL}(1,\mathbb{F}_p)/T \simeq \mathbb{F}_p^*. \tag{6.6}$$

As a group, $\mathbb{F}_p$ is cyclic of order $p$, and Proposition A.5.3 implies that $\mathbb{F}_p^*$ is cyclic of
order $p-1$. In the Mathematical and Historical Notes we will say more about how
$\mathrm{AGL}(1,\mathbb{F}_p)$ is built from these cyclic groups.


B. The Universal Extension. In Chapter 2 we studied the elementary symmetric
polynomials $\sigma_1,\dots,\sigma_n$ in variables $x_1,\dots,x_n$. Recall from Proposition 2.1.4 that

$$(x - x_1) \cdots (x - x_n) = x^n - \sigma_1 x^{n-1} + \cdots + (-1)^r \sigma_r x^{n-r} + \cdots + (-1)^n \sigma_n.$$

This is the universal polynomial of degree $n$ introduced in Section 2.2 and denoted
by $f$. Note that $f$ is a polynomial in $x$ with coefficients in the field

$$K = F(\sigma_1,\dots,\sigma_n).$$

Since the roots of $f$ are $x_1,\dots,x_n$, it follows easily that

$$L = F(x_1,\dots,x_n)$$

is the splitting field of $f$ over $K$. We call $K \subset L$ the universal extension in degree $n$.
Since $f$ has distinct roots, Section 6.3 gives a one-to-one group homomorphism
$\mathrm{Gal}(L/K) \to S_n$. We now prove that this map is an isomorphism.


Theorem 6.4.1 The universal extension $K = F(\sigma_1,\dots,\sigma_n) \subset L = F(x_1,\dots,x_n)$ in
degree $n$ is the splitting field of a separable polynomial. The action of the Galois
group on the roots of the universal polynomial of degree $n$ gives an isomorphism

$$\mathrm{Gal}(L/K) \simeq S_n.$$


Proof: We showed above that $L$ is the splitting field of the universal polynomial $f$.
Notice also that $f$ is separable, since its roots $x_1,\dots,x_n$ are distinct.

To prove the final assertion of the theorem, we will use the action of $S_n$ on
$F[x_1,\dots,x_n]$ discussed in Section 2.4. Recall that for $f \in F[x_1,\dots,x_n]$ and $\tau \in S_n$,
$\tau \cdot f$ is the polynomial obtained by permuting the variables according to $\tau$. This
action has the properties

$$\begin{aligned}
\tau \cdot (f + g) &= \tau \cdot f + \tau \cdot g, \\
\tau \cdot (fg) &= (\tau \cdot f)(\tau \cdot g), \\
\tau \cdot (\gamma \cdot f) &= (\tau\gamma) \cdot f,
\end{aligned} \tag{6.7}$$

where $\tau, \gamma \in S_n$ and $f, g \in F[x_1,\dots,x_n]$. You will prove this in Exercise 3.

In Exercises 4 and 5 you will also show that $f \mapsto \tau \cdot f$ is a ring isomorphism from
$F[x_1,\dots,x_n]$ to itself and hence extends to an isomorphism of its field of fractions.
It follows that permuting the variables according to $\tau$ gives an automorphism of
$L = F(x_1,\dots,x_n)$. Since the elementary symmetric polynomials are fixed by the
action of $\tau \in S_n$, this automorphism is the identity on $F(\sigma_1,\dots,\sigma_n)$.

We have thus shown that $f \mapsto \tau \cdot f$ is an element of $\mathrm{Gal}(L/K)$. Under the map
$\mathrm{Gal}(L/K) \to S_n$ of Proposition 6.3.1, this automorphism obviously maps to $\tau$. Since
$\tau$ was an arbitrary element of $S_n$, we see that $\mathrm{Gal}(L/K) \to S_n$ is onto, which completes
the proof of the theorem. ∎
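The key fact used in the proof, that the elementary symmetric polynomials do not change when the variables are permuted, can be spot-checked numerically. The sketch below is a sanity check at one integer point rather than a proof of the polynomial identity; the sample values $2, 3, 5, 7$ and the function names are our own choices. It also confirms the sign pattern in the expansion of $(x - x_1)\cdots(x - x_n)$.

```python
from itertools import combinations, permutations
from math import prod


def elementary_symmetric(values):
    """Return [sigma_1, ..., sigma_n] evaluated at the given values."""
    n = len(values)
    return [sum(prod(c) for c in combinations(values, r)) for r in range(1, n + 1)]


def coefficients_from_roots(values):
    """Coefficients of prod (x - v), highest degree first."""
    coeffs = [1]
    for v in values:
        # Multiply the current polynomial by (x - v).
        coeffs = [a - v * b for a, b in zip(coeffs + [0], [0] + coeffs)]
    return coeffs


def invariant_under_all_permutations(values):
    """Check that every reordering of the values gives the same sigma_r."""
    reference = elementary_symmetric(values)
    return all(elementary_symmetric(perm) == reference for perm in permutations(values))


roots = (2, 3, 5, 7)
sigmas = elementary_symmetric(roots)
# Expected pattern: x^4 - sigma_1 x^3 + sigma_2 x^2 - sigma_3 x + sigma_4.
signed = [1] + [(-1) ** r * s for r, s in enumerate(sigmas, start=1)]
print(sigmas, coefficients_from_roots(roots) == signed)
print(invariant_under_all_permutations(roots))
```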


Chapter 7 will describe the Galois theory of the universal extension. 


C. A Polynomial of Degree 5. Consider the polynomial $f = x^5 - 6x + 3$, and let $L$
be the splitting field of $f$ over $\mathbb{Q}$. The Schönemann-Eisenstein criterion implies that
$f$ is irreducible and hence separable. Thus $\mathrm{Gal}(L/\mathbb{Q})$ is isomorphic to a subgroup
$H \subset S_5$. We will show that $H = S_5$, so that

$$\mathrm{Gal}(L/\mathbb{Q}) \simeq S_5. \tag{6.8}$$

We will sketch the proof and leave the details for Exercise 6.

By Exercise 6 of Section 6.2, $|\mathrm{Gal}(L/\mathbb{Q})| = |H|$ is divisible by 5, since $f$ is
irreducible. By Cauchy’s Theorem (Theorem A.1.5) from group theory, $H$ must have
an element $g$ of order 5. Recall that $g$ is a product of disjoint cycles whose order is
the least common multiple of the lengths of the cycles. Since $g$ is in $S_5$ and has order
5, one easily sees that $g$ is in fact a 5-cycle. Thus $H$ contains a 5-cycle.

By the Fundamental Theorem of Algebra, we can assume that $L \subset \mathbb{C}$, so that the
roots of $f$ can be regarded as complex numbers. Furthermore, using curve graphing
techniques from calculus, one also sees that $f$ has exactly three real roots. It follows
that complex conjugation gives an element $\tau \in \mathrm{Gal}(L/\mathbb{Q})$ that interchanges two of
the roots and fixes the other three. Since $\tau$ maps to a transposition in $S_5$, we conclude
that $H$ contains a transposition.
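The count of real roots can be made concrete with a few evaluations. The sketch below is one possible way to organize the calculus argument, not necessarily the one intended in Exercise 6: sign changes of $f = x^5 - 6x + 3$ at the integer sample points $-2, \dots, 2$ (our choice) give at least three real roots, and since $f' = 5x^4 - 6$ has only the two real zeros $\pm c$ with $c = (6/5)^{1/4}$, Rolle's theorem allows at most three.

```python
def f(x):
    """The quintic from the text."""
    return x**5 - 6 * x + 3


def sign_changes(values):
    """Count sign changes in a sequence of nonzero numbers."""
    signs = [v > 0 for v in values if v != 0]
    return sum(1 for s, t in zip(signs, signs[1:]) if s != t)


samples = [-2, -1, 0, 1, 2]
values = [f(x) for x in samples]

# Each sign change forces a real root by the Intermediate Value Theorem.
lower_bound = sign_changes(values)

# f' = 5x^4 - 6 vanishes only at x = -c and x = c, so f has at most 3 real roots.
c = (6 / 5) ** 0.25
print(values, lower_bound, f(-c) > 0 > f(c))
```

The values at the critical points, a local maximum above the axis and a local minimum below it, are what one reads off from the graph.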

Hence, relabeling the roots appropriately, we may assume that $H$ contains $(12345)$
and $(1i)$ for some $i \in \{2,3,4,5\}$. Since $(12345)^{i-1}$ is a 5-cycle beginning $(1i\dots)$,
we can relabel the roots again so that $H$ contains $(12345)$ and $(12)$. It is a classic
result in group theory that these two permutations generate $S_5$. You probably studied
this in your abstract algebra course (if not, you should do Exercise 7). This shows
that $H = S_5$ and completes the proof of (6.8).
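The generation statement is small enough to confirm by exhaustive closure. The following sketch, with hypothetical helper names and permutations written 0-indexed as tuples, builds the subgroup generated by $(12345)$ and $(12)$ and counts its elements; it is a finite check for $n = 5$ and does not replace the general argument asked for in Exercise 7.

```python
def compose(p, q):
    """Return the permutation p∘q (apply q first, then p)."""
    return tuple(p[q[k]] for k in range(len(q)))


def generate(gens):
    """Subgroup generated by gens, found by closing under left multiplication."""
    identity = tuple(range(len(gens[0])))
    group = {identity}
    frontier = [identity]
    while frontier:
        g = frontier.pop()
        for s in gens:
            h = compose(s, g)
            if h not in group:
                group.add(h)
                frontier.append(h)
    return group


five_cycle = (1, 2, 3, 4, 0)      # (12345), 0-indexed: 0 -> 1 -> 2 -> 3 -> 4 -> 0
transposition = (1, 0, 2, 3, 4)   # (12), 0-indexed: swaps 0 and 1

H = generate([five_cycle, transposition])
print(len(H))  # |S_5| = 5! = 120
```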

This example is taken from [Stewart, Chapter 14]. In Chapter 8 we will see that
$f = x^5 - 6x + 3$ is not solvable by radicals, since $\mathrm{Gal}(L/\mathbb{Q}) \simeq S_5$. Different proofs
of (6.8) will be given in Examples 13.2.8 and 13.4.7 of Chapter 13.


Mathematical Notes 
This section has several topics of interest to discuss. 
• Specialization of Galois Groups. For the fifth roots of 2, we have

the splitting field of $x^5 - 2$ over $\mathbb{Q}$ has Galois group $\mathrm{AGL}(1,\mathbb{F}_5)$,

while for the universal extension in degree 5 over $F = \mathbb{Q}$, we have

the splitting field of $x^5 - \sigma_1 x^4 + \sigma_2 x^3 - \sigma_3 x^2 + \sigma_4 x - \sigma_5$
over $\mathbb{Q}(\sigma_1,\sigma_2,\sigma_3,\sigma_4,\sigma_5)$ has Galois group $S_5$.

Since the second polynomial is the universal polynomial of degree 5, the first can be
regarded as the specialization of the second obtained by the mapping

$$\sigma_1 \mapsto 0, \quad \sigma_2 \mapsto 0, \quad \sigma_3 \mapsto 0, \quad \sigma_4 \mapsto 0, \quad \sigma_5 \mapsto 2.$$

However, the Galois groups are not the same, which implies that specialization does
not always preserve the Galois group. This is part of what makes Galois theory so
hard: polynomials of the same degree may have different Galois groups.

On the other hand, most specializations of the universal polynomial of degree $n$
over $\mathbb{Q}$ have $S_n$ as their Galois group. This follows from the Hilbert Irreducibility
Theorem, which is discussed in [Hadlock]. For example, one can prove that the
Galois group of $x^n - x - 1 \in \mathbb{Q}[x]$ is $S_n$ for all $n \ge 2$ (see [4, p. 42]).


• Semidirect Products. In the text we noted that the one-dimensional affine group
$\mathrm{AGL}(1,\mathbb{F}_p)$ has a normal subgroup $T \simeq \mathbb{F}_p$ such that $\mathrm{AGL}(1,\mathbb{F}_p)/T \simeq \mathbb{F}_p^*$. We will
now explain how $\mathrm{AGL}(1,\mathbb{F}_p)$ is the semidirect product of $\mathbb{F}_p$ and $\mathbb{F}_p^*$ via the action of
$\mathbb{F}_p^*$ on $\mathbb{F}_p$ given by $a \cdot u = au$.

In general, let $G$ and $H$ be groups, and assume that $G$ acts on $H$ (as defined in
Section A.4) in the following special way: for any $g \in G$, the map $h \mapsto g \cdot h$ given
by the action of $g$ on $H$ is a group homomorphism from $H$ to itself. Then define a
binary operation on the set $H \times G$ by

$$(h,g) \cdot (h',g') = (h(g \cdot h'),\, gg') \tag{6.9}$$

where $h(g \cdot h')$ is the product of $h,\, g \cdot h' \in H$. The intuition behind this formula is that
when we multiply $(h,g)$ and $(h',g')$, we “twist” by the action of $g$, since $g$ is between
$h$ and $h'$. In Exercise 8 you will show that this defines a group, called the semidirect
product $H \rtimes G$.

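For example, the action of $\mathbb{F}_p^*$ on $\mathbb{F}_p$ gives the semidirect product $\mathbb{F}_p \rtimes \mathbb{F}_p^*$. In this
group, the product is given by

$$(b,a) \cdot (d,c) = (b + a \cdot d,\, ac) = (ad + b,\, ac),$$

since the group operation is addition in $\mathbb{F}_p$ and multiplication in $\mathbb{F}_p^*$. It follows that
$\gamma_{a,b} \mapsto (b,a)$ gives an isomorphism

$$\mathrm{AGL}(1,\mathbb{F}_p) \simeq \mathbb{F}_p \rtimes \mathbb{F}_p^*. \tag{6.10}$$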
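For a small prime, the group axioms for this twisted product can be checked exhaustively. The sketch below is an illustration only, with hypothetical function names; pairs are written $(b,a)$ with $b \in \mathbb{F}_p$ and $a \in \mathbb{F}_p^*$, matching the order used in the text, and residues are represented by the integers $0,\dots,p-1$.

```python
from itertools import product


def semidirect_mul(x, y, p):
    """(b, a) * (d, c) = (b + a*d, a*c), the product in F_p ⋊ F_p^*."""
    (b, a), (d, c) = x, y
    return ((b + a * d) % p, (a * c) % p)


def elements(p):
    """All pairs (b, a) with b in F_p and a in F_p^*."""
    return [(b, a) for b in range(p) for a in range(1, p)]


def is_group(p):
    """Check associativity, identity (0, 1), and inverses by brute force."""
    E = elements(p)
    e = (0, 1)
    mul = lambda x, y: semidirect_mul(x, y, p)
    associative = all(
        mul(mul(x, y), z) == mul(x, mul(y, z)) for x, y, z in product(E, repeat=3)
    )
    has_identity = all(mul(e, x) == x == mul(x, e) for x in E)
    has_inverses = all(any(mul(x, y) == e for y in E) for x in E)
    return associative and has_identity and has_inverses


def is_abelian(p):
    """Check whether the twisted product commutes."""
    E = elements(p)
    return all(
        semidirect_mul(x, y, p) == semidirect_mul(y, x, p)
        for x, y in product(E, repeat=2)
    )


print(is_group(5), is_abelian(5), is_abelian(3))
```

Already for $p = 3$ the product fails to commute, which is the phenomenon behind the nonisomorphic extensions of Exercise 10.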


For any semidirect product $H \rtimes G$, the map $(h,g) \mapsto g$ is a group homomorphism
that is clearly onto. In Exercise 8 you will check that the kernel of this map is

$$\{(h,e) \mid h \in H\} = H \times \{e\} \simeq H.$$

Then the Fundamental Theorem of Group Homomorphisms implies that

$$(H \rtimes G)/(H \times \{e\}) \simeq G.$$

In Exercise 9 you will explore how this relates to (6.6) and (6.10).

• The Extension Problem. Given groups $G$ and $H$, a third group $G_1$ is an extension of
$H$ by $G$ if $G_1$ contains a normal subgroup $H_1 \simeq H$ such that $G_1/H_1 \simeq G$. For example,
when $G$ acts on $H$ by group homomorphisms as above, the semidirect product $H \rtimes G$
is an extension of $H$ by $G$.

An important observation is that the same groups can have nonisomorphic extensions.
For example, Exercise 10 will show that the product $\mathbb{F}_p \times \mathbb{F}_p^*$ and the semidirect
product $\mathbb{F}_p \rtimes \mathbb{F}_p^*$ are nonisomorphic extensions of $\mathbb{F}_p$ by $\mathbb{F}_p^*$ when $p > 3$.

The extension problem in group theory asks whether it is possible to classify all
extensions of $H$ by $G$. This is a difficult problem and is one of the reasons why groups
are hard to classify. The extension problem is also related to group cohomology.


Historical Notes 


For the extension $\mathbb{Q} \subset L = \mathbb{Q}(\zeta_p, \sqrt[p]{2})$, one can describe $\mathrm{Gal}(L/\mathbb{Q}) \simeq \mathrm{AGL}(1,\mathbb{F}_p)$
in terms of permutations in $S_p$. We do this by replacing $\{1,\dots,p\}$ with the congruence
classes $\{[1],\dots,[p]\} = \mathbb{F}_p$. Then the affine linear transformation $\gamma_{a,b}$ defined in the
text becomes the element of $S_p$ represented by the permutation

$$\begin{pmatrix} 1 & 2 & \cdots & p \\ a \cdot 1 + b & a \cdot 2 + b & \cdots & a \cdot p + b \end{pmatrix}$$

provided we think of the entries as congruence classes modulo $p$. This can be
expressed more succinctly in the form

$$\binom{i}{ai + b}. \tag{6.11}$$

We will see in Chapter 12 that the permutations (6.11) are implicit in the work of
Lagrange. These permutations also appear explicitly in Galois’s study of irreducible
polynomials of prime degree that are solvable by radicals. We will have more to say
about this in Chapter 14.

In the late nineteenth and early twentieth centuries the subgroup of $S_p$ consisting
of the permutations (6.11) was called the metacyclic group. These days the term
metacyclic is used more generally to mean any group $G$ possessing a normal subgroup
$H$ such that both $H$ and $G/H$ are cyclic. In Exercise 11 you will show that the group
of permutations of the form (6.11) is metacyclic in this sense.

As for the Galois group of the universal extension computed in Theorem 6.4.1, 
Galois states this result as follows [Galois, p. 51]: 


In the case of algebraic equations, the group is nothing other than the collection 
of 1.2.3...m possible permutations on the m letters. . . 


Here, “algebraic equation” refers to the universal case where the “m letters” are the
roots of the universal polynomial of degree m. However, we will see in Chapter 12
that Galois’s use of the word “permutation” is different from ours.

Exercises for Section 6.4


Exercise 1. Given $a, b \in \mathbb{F}_p$, define $\gamma_{a,b} : \mathbb{F}_p \to \mathbb{F}_p$ by $\gamma_{a,b}(u) = au + b$.
(a) Prove that $\gamma_{a,b}$ is one-to-one and onto if and only if $a \ne 0$.
(b) Suppose that $a \ne 0$. Prove that the inverse function of $\gamma_{a,b}$ is $\gamma_{a^{-1},\,-a^{-1}b}$.
(c) Show that

$$\mathrm{AGL}(1,\mathbb{F}_p) = \{\gamma_{a,b} \mid (a,b) \in \mathbb{F}_p^* \times \mathbb{F}_p\}$$

is a group under composition.


Exercise 2. Consider the map $\mathrm{AGL}(1,\mathbb{F}_p) \to \mathbb{F}_p^*$ defined by $\gamma_{a,b} \mapsto a$.
(a) Show that this map is an onto group homomorphism with kernel $T = \{\gamma_{1,b} \mid b \in \mathbb{F}_p\}$.
Then use this to prove (6.6).
(b) Show that $T \simeq \mathbb{F}_p$.


Exercise 3. This exercise is concerned with the proof of (6.7). Given $\tau \in S_n$, observe that
$f \mapsto \tau \cdot f$ can be regarded as the evaluation map from $F[x_1,\dots,x_n]$ to itself that evaluates
$f(x_1,\dots,x_n)$ at $(x_{\tau(1)},\dots,x_{\tau(n)})$.
(a) Explain why Theorem 2.1.2 implies that $f \mapsto \tau \cdot f$ is a ring homomorphism. This proves
the first two bullets of (6.7).
(b) Prove the third bullet of (6.7).


Exercise 4. Let $\tau \in S_n$. Prove that $f \mapsto \tau \cdot f$ is a ring isomorphism from $F[x_1,\dots,x_n]$ to itself.


Exercise 5. Let $R$ be an integral domain, and let $K$ be its field of fractions. Prove that every
ring isomorphism $\phi : R \to R$ extends uniquely to an automorphism $\phi : K \to K$.


Exercise 6. As in the text, let $f = x^5 - 6x + 3$.
(a) Use the hints given in the text to show that every element of $S_5$ of order 5 is a 5-cycle.
(b) Use curve graphing from calculus to show that $f$ has exactly three real roots.


Exercise 7. Show that $S_n$ is generated by the transposition $(12)$ and the $n$-cycle $(12 \dots n)$.


Exercise 8. Let $G$ and $H$ be groups where $G$ acts on $H$ by group homomorphisms. As in the
text, we let $H \rtimes G$ denote the set $H \times G$ with the binary operation given by (6.9).
(a) Prove that $H \rtimes G$ is a group.
(b) Prove that the map $H \rtimes G \to G$ defined by $(h,g) \mapsto g$ is an onto group homomorphism
with kernel $H \times \{e\}$.
(c) Prove that $h \mapsto (h,e)$ defines an isomorphism $H \simeq H \times \{e\}$ (where the group structure
on $H \times \{e\}$ comes from $H \rtimes G$).


Exercise 9. Explain how (6.6) and (6.10) relate to the last paragraph of the discussion of
semidirect products in the Mathematical Notes.


Exercise 10. Let $p > 3$ be prime, and let $\mathbb{F}_p \rtimes \mathbb{F}_p^*$ be the semidirect product described in the
Mathematical Notes.

(a) Show that $\mathbb{F}_p \rtimes \mathbb{F}_p^*$ is not Abelian.

(b) Show that the product group $\mathbb{F}_p \times \mathbb{F}_p^*$ is Abelian.

(c) Show that $\mathbb{F}_p \times \mathbb{F}_p^*$ is an extension of $\mathbb{F}_p$ by $\mathbb{F}_p^*$.

Since we already know that $\mathbb{F}_p \rtimes \mathbb{F}_p^*$ is an extension of $\mathbb{F}_p$ by $\mathbb{F}_p^*$, we see that (a) and (b) give
nonisomorphic extensions.


Exercise 11. The goal of this exercise is to show that the group $G$ of permutations (6.11) is
metacyclic in the sense that $G$ has a normal subgroup $H$ such that $H$ and $G/H$ are cyclic. Show
that this follows from $G \simeq \mathrm{AGL}(1,\mathbb{F}_p)$ together with (6.6) and Proposition A.5.3.

Exercise 12. Let $p$ be prime. Generalize part (a) of Exercise 6 by showing that every element
of $S_p$ of order $p$ is a $p$-cycle.


Exercise 13. Let $L$ be the splitting field of $2x^5 - 10x + 5$ over $\mathbb{Q}$. Prove that $\mathrm{Gal}(L/\mathbb{Q}) \simeq S_5$.


Exercise 14. Let $L = \mathbb{Q}(\zeta_p, \sqrt[p]{2})$. Prove that $L = \mathbb{Q}(\sqrt[p]{2}, \zeta_p\sqrt[p]{2})$, i.e., the splitting field of
$x^p - 2$ over $\mathbb{Q}$ can be generated by two of its roots. Chapter 14 will show that this follows from
Galois’s criterion for an irreducible polynomial of prime degree to be solvable by radicals.


Exercise 15. Let $L = \mathbb{Q}(\zeta_p, \sqrt[p]{2})$. The description of $\mathrm{Gal}(L/\mathbb{Q})$ given in the text enables one
to construct some elements of $\mathrm{Gal}(L/\mathbb{Q}(\zeta_p))$. Use these automorphisms and Proposition 6.3.7
to prove that $x^p - 2$ is irreducible over $\mathbb{Q}(\zeta_p)$.


6.5 ABELIAN EQUATIONS (OPTIONAL) 


In this section we will discuss the following theorem of Abel: 


If the roots of an equation of arbitrary degree are related among themselves in
such a way that all the roots can be expressed rationally by means of one of
them, which we denote by $x$; if in addition whenever one denotes by $\theta x$, $\theta_1 x$ two
other arbitrary roots, one has

$$\theta\theta_1 x = \theta_1\theta x,$$

then the equation to which they belong will always be solvable algebraically.


(See [Abel, p. 479].) Our goal is to interpret this theorem in terms of Galois theory. 

We begin by translating Abel’s theorem into modern terminology. First observe
that Abel talks about an equation $f = 0$ rather than a polynomial $f$. This is typical
for the early nineteenth century. We will assume that $f$ is a nonconstant polynomial
whose coefficients lie in a field $F$. Since we prefer $x$ to be a variable, we will replace
Abel’s $x$ with $\alpha$. So $\alpha$ will be a root of $f$ in some extension field.

Now let $\alpha_1 = \alpha, \alpha_2, \dots, \alpha_n$ be the roots of $f$ in a splitting field $L$. Then, when Abel
says that the roots can be “expressed rationally” in terms of $\alpha$, he means that there
are rational functions $\theta_i$ with coefficients in $F$ such that $\alpha_i = \theta_i(\alpha)$. In Exercise 1
you will show that this is equivalent to

$$L = F(\alpha). \tag{6.12}$$

Here is an example of this from Chapter 4.
Here is an example of this from Chapter 4. 


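Example 6.5.1 If we let $\alpha = \sqrt{2} + \sqrt{3}$, then (4.4) implies that $f = x^4 - 10x^2 + 1$
factors as

$$(x - \alpha)(x + \alpha)(x - 10\alpha + \alpha^3)(x + 10\alpha - \alpha^3).$$

If we set $\theta_1(x) = x$, $\theta_2(x) = -x$, $\theta_3(x) = 10x - x^3$, and $\theta_4(x) = -10x + x^3$, then the
roots of $f$ are $\theta_1(\alpha)$, $\theta_2(\alpha)$, $\theta_3(\alpha)$, and $\theta_4(\alpha)$.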


The rational functions $\theta_i$ in Abel’s theorem give functions from $L$ to $L$ that usually
fail to be automorphisms. For instance, the function $\theta_2(x) = -x$ in the above example
does not preserve multiplication, since $\theta_2(ab) = -ab$ differs from $\theta_2(a)\theta_2(b) =
(-a)(-b) = ab$ whenever $ab \ne 0$.
In our notation, the displayed equation in the quote from Abel becomes

$$\theta_i(\theta_j(\alpha)) = \theta_j(\theta_i(\alpha)), \quad 1 \le i, j \le n. \tag{6.13}$$

In Exercise 2 you will show that the rational functions of Example 6.5.1 satisfy this
condition. Following Kronecker and Jordan, we call $f = 0$ an Abelian equation if $f$
is a nonconstant polynomial with a root $\alpha$ satisfying (6.12) and (6.13).
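Conditions of this kind can be tested exactly by computing in $\mathbb{Q}(\alpha) = \mathbb{Q}[x]/(f)$. The sketch below does this for Example 6.5.1, under our own representation: an element $c_0 + c_1\alpha + c_2\alpha^2 + c_3\alpha^3$ is stored as the list `[c0, c1, c2, c3]`, and products are reduced with $\alpha^4 = 10\alpha^2 - 1$. The helper names are hypothetical, and a computation like this is a check, not a substitute for the argument requested in Exercise 2.

```python
def mul(u, v):
    """Multiply two elements of Q(alpha), reducing with alpha^4 = 10*alpha^2 - 1."""
    raw = [0] * 7
    for i, a in enumerate(u):
        for j, b in enumerate(v):
            raw[i + j] += a * b
    for k in range(6, 3, -1):
        # alpha^k = 10*alpha^(k-2) - alpha^(k-4)
        c, raw[k] = raw[k], 0
        raw[k - 2] += 10 * c
        raw[k - 4] -= c
    return raw[:4]


def add(u, v):
    return [a + b for a, b in zip(u, v)]


def scale(c, u):
    return [c * a for a in u]


def power(u, n):
    result = [1, 0, 0, 0]
    for _ in range(n):
        result = mul(result, u)
    return result


def theta(i, u):
    """The functions theta_1, ..., theta_4 of Example 6.5.1, applied to u."""
    if i == 1:
        return list(u)
    if i == 2:
        return scale(-1, u)
    if i == 3:
        return add(scale(10, u), scale(-1, power(u, 3)))
    return add(scale(-10, u), power(u, 3))


def f(u):
    """Evaluate f = x^4 - 10x^2 + 1 at u."""
    return add(add(power(u, 4), scale(-10, power(u, 2))), [1, 0, 0, 0])


alpha = [0, 1, 0, 0]
indices = (1, 2, 3, 4)
all_roots = all(f(theta(i, alpha)) == [0, 0, 0, 0] for i in indices)
commute = all(
    theta(i, theta(j, alpha)) == theta(j, theta(i, alpha))
    for i in indices for j in indices
)
print(all_roots, commute)
```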

The conclusion of Abel’s theorem states that f is “solvable algebraically.” In 
modern terms this means “solvable by radicals,” which will be defined carefully in 
Chapter 8. Thus Abel’s theorem can be restated as follows. 


Theorem 6.5.2 In characteristic 0, every Abelian equation is solvable by radicals. 


The hypothesis about characteristic 0 is not in Abel’s original statement but is 
needed since the theory of solvability developed in Chapter 8 only applies to fields 
of characteristic 0. Abel always worked in characteristic 0. 

Theorem 6.5.2 is a consequence of the following two theorems. Recall from
Section 6.1 that the Galois group of $f \in F[x]$ is $\mathrm{Gal}(L/F)$, where $L$ is a splitting field
of $f$ over $F$.


Theorem 6.5.3 The Galois group of an Abelian equation is an Abelian group. 


Theorem 6.5.4 In characteristic 0, a polynomial with Abelian Galois group is solv- 
able by radicals. 


We will prove Theorem 6.5.4 in Chapter 8. The proof will follow from Galois’s 
criterion for solvability by radicals together with the fact that every finite Abelian 
group is solvable. (All of these terms will be defined in Chapter 8.) 

We now prove Theorem 6.5.3. 


Proof of Theorem 6.5.3: If the Abelian equation is $f = 0$, where $f \in F[x]$, then $f$
has a root $\alpha$ such that $L = F(\alpha)$ is the splitting field of $f$. In particular, we have
rational functions $\theta_i(x) \in F(x)$, $1 \le i \le n$, such that the $\theta_i(\alpha)$ are the roots of $f$. Now
let $\sigma, \tau \in \mathrm{Gal}(L/F)$. In Exercise 3 you will prove the following:

• $\sigma(\alpha) = \theta_i(\alpha)$ and $\tau(\alpha) = \theta_j(\alpha)$ for some $i$ and $j$.

• $\sigma\tau = \tau\sigma$ in $\mathrm{Gal}(L/F)$ if and only if $\sigma(\tau(\alpha)) = \tau(\sigma(\alpha))$ in $L$.

• $\sigma(\tau(\alpha)) = \theta_j(\theta_i(\alpha))$ and $\tau(\sigma(\alpha)) = \theta_i(\theta_j(\alpha))$.

Since $f = 0$ is Abelian, the theorem follows easily from these bullets. ∎


In the 1880s, Weber applied the term “Abelian” to commutative groups because
of this theorem.

Historical Notes


The story of Abelian equations begins with Gauss, who showed in 1801 that
$x^n - 1 = 0$ is solvable by radicals. We will study Gauss’s work in Chapter 9, and in
Chapter 10 we will explore the surprising geometric consequences of his results. In
Exercise 4 you will show that $x^n - 1 = 0$ is an Abelian equation over $\mathbb{Q}$.

In his 1829 paper Mémoire sur une classe particulière d’équations résolubles
algébriquement [Abel, Vol. I, pp. 478-507], Abel states the theorem quoted at the
beginning of the section and goes on to say

After having explained this theory [the solvability of Abelian equations] in 
general, I will apply it to circular and elliptic functions. 


(See [Abel, Vol. I, p. 479].) In this passage, “circular functions” refer to the work of 
Gauss just mentioned, and “elliptic functions” refer to Abel’s deep results on elliptic 
functions and complex multiplication. We will discuss a special case of this involving 
the lemniscate in Chapter 15. Abel died at age 26 before he could publish the full 
details of his work. 

Kronecker introduced the term “Abelian equation” in 1853 in the special case 
when the Galois group was cyclic. The general sense of the term, as defined here, 
is due to Jordan in 1870. Kronecker’s interest in Abelian equations is related to his 
amazing conjecture that the roots of an Abelian equation over Q can be expressed 
rationally in terms of a root of unity. This was proved in 1886 by Weber and is now 
called the Kronecker-Weber Theorem. The modern version of this theorem is stated 
as follows. 


Theorem 6.5.5 Suppose that $\mathbb{Q} \subset L$ is a finite extension such that $L \subset \mathbb{C}$. Then the
following conditions are equivalent:

(a) $\mathbb{Q} \subset L$ is normal and $\mathrm{Gal}(L/\mathbb{Q})$ is Abelian.

(b) There is a root of unity $\zeta_n = e^{2\pi i/n}$ such that $L \subset \mathbb{Q}(\zeta_n)$.


In the next chapter, you will prove (b) ⇒ (a) in Exercise 14 of Section 7.3, and
a proof of (a) ⇒ (b) can be found in [3, pp. 125-129]. The proof of (a) ⇒ (b) uses
ideas from algebraic number theory and is beyond the scope of this book.

The early history of group theory and Galois theory are closely related—after all, 
Galois was the person who introduced the term “group” into mathematics. So it is 
not surprising that notions like Abelian equations from Galois theory influenced the 
terminology of group theory. We will see many more examples of this phenomenon 
in the next chapter. 


Exercises for Section 6.5 


Exercise 1. Assume that $f \in F[x]$ is nonconstant and has roots $\alpha_1 = \alpha, \alpha_2, \dots, \alpha_n$ in a splitting
field $L$. Prove that $L = F(\alpha)$ if and only if there are rational functions $\theta_i \in F(x)$ such that
$\alpha_i = \theta_i(\alpha)$. Can we assume that the $\theta_i$ are polynomials?


Exercise 2. Show that the equation $x^4 - 10x^2 + 1 = 0$ discussed in Example 6.5.1 is Abelian.


Exercise 3. Complete the proof of Theorem 6.5.3.

Exercise 4. Show that $x^n - 1 = 0$ is an Abelian equation over $\mathbb{Q}$.


Exercise 5. Let $f$ be the minimal polynomial of $\sqrt{2 + \sqrt{2}}$ over $\mathbb{Q}$. Show that $f = 0$ is an
Abelian equation.


Exercise 6. In this exercise, you will prove a partial converse to Theorem 6.5.3. Suppose that
a finite extension $F \subset L$ is normal and separable and has an Abelian Galois group.
(a) Explain why $F \subset L$ has a primitive element.
(b) By part (a), we can find $\alpha \in L$ such that $L = F(\alpha)$. Let $f$ be the minimal polynomial of
$\alpha$. Prove that $f = 0$ is an Abelian equation over $F$.
See Theorem 8.5.8 for the precise relation between Abelian equations and Abelian groups.


Exercise 7. Show that the implication (a) ⇒ (b) of Theorem 6.5.5 is equivalent to Kronecker’s
assertion that the roots of an Abelian equation over $\mathbb{Q}$ can be expressed rationally in terms of
a root of unity.

CHAPTER 7


THE GALOIS CORRESPONDENCE 


This chapter will draw on the work we did in Chapters 4, 5, and 6 to state and prove 
the main theorems of Galois theory. We will also give some applications. 


7.1. GALOIS EXTENSIONS 


In Section 6.2 we learned that splitting fields of separable polynomials are especially 
nice from the point of view of Galois theory. The main goal of this section is to 
characterize such extensions in terms of normality and separability. We will also 
apply this theory to study separable extensions. 


A. Splitting Fields of Separable Polynomials. Before stating our main result,
we introduce the idea of a fixed field. Suppose that we have a finite extension $F \subset L$
with Galois group $\mathrm{Gal}(L/F)$. Given a subgroup $H \subset \mathrm{Gal}(L/F)$, we call

$$L_H = \{\alpha \in L \mid \sigma(\alpha) = \alpha \text{ for all } \sigma \in H\}$$

the fixed field of $H$. This terminology is justified by Exercise 1, where you will show
that $L_H$ is a subfield of $L$ containing $F$.


Here is one of the important theorems of Galois theory. 


Galois Theory, Second Edition. By David A. Cox 147 
Copyright © 2012 John Wiley & Sons, Inc. 




Theorem 7.1.1 Let $F \subset L$ be a finite extension. Then the following are equivalent:
(a) $L$ is the splitting field of a separable polynomial in $F[x]$.

(b) $F$ is the fixed field of $\mathrm{Gal}(L/F)$ acting on $L$.

(c) $F \subset L$ is a normal separable extension.


Proof: (a) ⇒ (b): Let $K$ be the fixed field of $\mathrm{Gal}(L/F)$. By Exercise 1 we have
$F \subset K \subset L$, and the goal is to show $K = F$. For this purpose, note that since $L$ is the
splitting field of a separable polynomial $f \in F[x]$ over $F$, the same is true over the
larger field $K$, since $f$ also lies in $K[x]$. By Theorem 6.2.1 it follows that

$$[L:F] = |\mathrm{Gal}(L/F)| \quad \text{and} \quad [L:K] = |\mathrm{Gal}(L/K)|.$$

Next observe that $\mathrm{Gal}(L/K) \subset \mathrm{Gal}(L/F)$, since if an automorphism of $L$ is the identity
on $K$, then it is also the identity on the smaller field $F$. The reverse inclusion also
holds, since every $\sigma \in \mathrm{Gal}(L/F)$ is the identity on $K$, for $K$ is the fixed field of
$\mathrm{Gal}(L/F)$. It follows that $\mathrm{Gal}(L/K) = \mathrm{Gal}(L/F)$. Combining this with the above
equations, we see that

$$[L:F] = [L:K].$$

Since $[L:F] = [L:K][K:F]$, we have $[K:F] = 1$, and $K = F$ follows.

(b) ⇒ (c): Now suppose that $F$ is the fixed field of $\mathrm{Gal}(L/F)$ and let $\alpha \in L$. We
will find the minimal polynomial of $\alpha$ over $F$ using a construction due to Lagrange.
Let $\alpha_1 = \alpha, \alpha_2, \dots, \alpha_r$ be the distinct elements of $L$ obtained by applying the elements
of $\mathrm{Gal}(L/F)$ to $\alpha$. Then consider the polynomial

$$h(x) = \prod_{i=1}^{r} (x - \alpha_i) \in L[x]. \tag{7.1}$$

We claim that $h \in F[x]$ and that $h$ is irreducible over $F$.

We first show that $\sigma \in \mathrm{Gal}(L/F)$ permutes the $\alpha_i$. By definition, $\alpha_i = \tau(\alpha)$ for
some $\tau \in \mathrm{Gal}(L/F)$. Then $\sigma(\alpha_i) = \sigma(\tau(\alpha)) = (\sigma\tau)(\alpha)$, which is $\alpha_j$ for some $j$.
Thus $\sigma$ maps $\{\alpha_1,\dots,\alpha_r\}$ to itself, which gives a permutation, since $\sigma$ is one-to-one.

Since $\sigma$ permutes the $\alpha_i$, it also permutes the factors $x - \alpha_i$ of $h$. This shows that
the coefficients of $h$ are fixed by $\mathrm{Gal}(L/F)$ and hence lie in the fixed field, which is $F$
by assumption. Hence $h \in F[x]$, as claimed.
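Lagrange's construction can be carried out exactly in a small case. The sketch below works in $L = \mathbb{Q}(\sqrt{2}, \sqrt{3})$ over $F = \mathbb{Q}$, storing $a + b\sqrt{2} + c\sqrt{3} + d\sqrt{6}$ as the tuple `(a, b, c, d)`. It takes as an assumption (standard, but not established in this section) that the Galois group consists of the four automorphisms sending $\sqrt{2} \mapsto \pm\sqrt{2}$ and $\sqrt{3} \mapsto \pm\sqrt{3}$; the function names are hypothetical.

```python
def mul(u, v):
    """Multiply a1 + b1√2 + c1√3 + d1√6 by a2 + b2√2 + c2√3 + d2√6."""
    a1, b1, c1, d1 = u
    a2, b2, c2, d2 = v
    return (
        a1 * a2 + 2 * b1 * b2 + 3 * c1 * c2 + 6 * d1 * d2,
        a1 * b2 + b1 * a2 + 3 * (c1 * d2 + d1 * c2),   # √3·√6 = 3√2
        a1 * c2 + c1 * a2 + 2 * (b1 * d2 + d1 * b2),   # √2·√6 = 2√3
        a1 * d2 + d1 * a2 + b1 * c2 + c1 * b2,
    )


def sub(u, v):
    return tuple(x - y for x, y in zip(u, v))


def automorphisms():
    """Assumed Galois group: √2 -> s√2, √3 -> t√3 with s, t = ±1."""
    return [
        lambda u, s=s, t=t: (u[0], s * u[1], t * u[2], s * t * u[3])
        for s in (1, -1) for t in (1, -1)
    ]


def lagrange_polynomial(alpha):
    """Coefficients (highest degree first) of h = prod (x - alpha_i), as in (7.1),
    where the alpha_i are the distinct images of alpha."""
    conjugates = sorted({sigma(alpha) for sigma in automorphisms()})
    zero = (0, 0, 0, 0)
    coeffs = [(1, 0, 0, 0)]
    for r in conjugates:
        # Multiply the current polynomial by (x - r).
        coeffs = [sub(a, mul(r, b)) for a, b in zip(coeffs + [zero], [zero] + coeffs)]
    return coeffs


h = lagrange_polynomial((0, 1, 1, 0))  # alpha = √2 + √3
print(h)
```

Every coefficient of `h` has zero $\sqrt{2}$, $\sqrt{3}$, and $\sqrt{6}$ components, so $h$ lies in $\mathbb{Q}[x]$ as the argument predicts, and for $\alpha = \sqrt{2} + \sqrt{3}$ it is $x^4 - 10x^2 + 1$, the polynomial of Example 6.5.1.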

Next let $g \in F[x]$ be the irreducible factor of $h$ that vanishes at $\alpha$. Then Proposition 6.1.4
shows that $\sigma(\alpha)$ is also a root of $g$ for all $\sigma \in \mathrm{Gal}(L/F)$. Since the $\alpha_i$ are
the distinct elements of $L$ obtained in this way, (7.1) shows that $h \mid g$. It follows that
$h$ is irreducible over $F$, since $g$ is an irreducible factor of $h$.

We conclude that $h \in F[x]$ is the minimal polynomial of $\alpha$ over $F$, since $h$ is
irreducible over $F$ and has $\alpha$ as a root. The above formula for $h$ also shows that $h$ is
separable and splits completely over $L$. Hence:

• Normality: If $f \in F[x]$ is irreducible and has a root $\alpha \in L$, then $f$ is the polynomial
$h$ defined in (7.1) (up to a constant factor). Thus $f$ splits completely over $L$, which
proves normality.

• Separability: If $\alpha \in L$, then its minimal polynomial is the polynomial $h$. Then $\alpha$
is separable over $F$ because $h$ is, and separability follows.

This shows that $F \subset L$ is normal and separable, as claimed.

(c) ⇒ (a): Finally, suppose that $F \subset L$ is normal and separable. We can write
$L = F(\alpha_1,\dots,\alpha_n)$, where the minimal polynomial $p_i$ of $\alpha_i$ over $F$ is separable. Let
$q_1,\dots,q_r$ be the distinct elements of the set $\{p_1,\dots,p_n\}$, and set

$$f = q_1 \cdots q_r.$$

By Lemma 5.3.4, $f$ is separable (the lemma applies because the $q_i$ are monic; do
you see why?). Furthermore, the proof of Theorem 5.2.4 shows that $L$ is the splitting
field of $f$ over $F$ (you will check this in Exercise 2). Thus $L$ is the splitting field over
$F$ of a separable polynomial in $F[x]$, as claimed. ∎


In light of this theorem, we make the following definition. 


Definition 7.1.2 An extension F C L is called a Galois extension if it is a finite 
extension satisfying any of the equivalent conditions of Theorem 7.1.1. 


To see how Definition 7.1.2 works, consider the following extensions. 

• The extension Q ⊂ Q(√2, √3) is Galois, since Q(√2, √3) is the splitting field of
(x² − 2)(x² − 3) over Q. This uses part (a) of Theorem 7.1.1.

• The extension Q ⊂ Q(∛2) is not Galois, since x³ − 2 is irreducible over Q, has
a root in Q(∛2), but does not split completely over Q(∛2). This uses part (c) of
Theorem 7.1.1.
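
To make the polynomial (7.1) from the proof of Theorem 7.1.1 concrete, the following Python sketch (an illustration added here, not part of the original argument) takes α = √2 + √3 in Q(√2, √3). It assumes the standard fact that each element of the Galois group sends √2 to ±√2 and √3 to ±√3, so the conjugates of α are the four numbers ±√2 ± √3. The helper name poly_mul is hypothetical, and the arithmetic is floating point, so the coefficients are rounded at the end.

```python
from itertools import product
from math import sqrt

def poly_mul(p, q):
    """Multiply two polynomials given as coefficient lists, lowest degree first."""
    out = [0.0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

# Assumption: the four automorphisms are given by independent sign choices
# on sqrt(2) and sqrt(3), so these are the distinct conjugates of alpha.
alpha_conjugates = [s * sqrt(2) + t * sqrt(3) for s, t in product((1, -1), repeat=2)]

# Build h(x) as the product of the factors (x - alpha_i), as in (7.1).
h = [1.0]
for a in alpha_conjugates:
    h = poly_mul(h, [-a, 1.0])

h_rounded = [round(c) for c in h]
print(h_rounded)  # [1, 0, -10, 0, 1], i.e. h(x) = x^4 - 10x^2 + 1
```

The rounded coefficients are rational, as the proof predicts, and x⁴ − 10x² + 1 is the minimal polynomial of √2 + √3 over Q.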


Here is one case where being a Galois extension is automatic. 


Proposition 7.1.3 Suppose that F ⊂ L is a Galois extension and that we have an
intermediate field F ⊂ K ⊂ L. Then K ⊂ L is a Galois extension.


Proof: We will use part (a) of Theorem 7.1.1. If F ⊂ L is Galois, then L is the
splitting field of a separable polynomial f ∈ F[x]. By regarding f as an element
of K[x], it follows immediately that the same is true over the larger field K. (This is
the argument used in the proof of (a) ⇒ (b) from Theorem 7.1.1.) ∎


While the proof of Proposition 7.1.3 seems easy, notice that it is much less obvious 
if we think in terms of parts (b) and (c) of Theorem 7.1.1. 

The reader should also note that in the situation of Proposition 7.1.3, F ⊂ K need
not be Galois. Here is a simple example to illustrate this.


Example 7.1.4 By Example 4.1.10, Q ⊂ Q(i, ⁴√2) is the splitting field of x⁴ − 2 and
hence is a Galois extension. Consider the intermediate fields Q(i) and Q(⁴√2). Then
Q ⊂ Q(i) is Galois (it is the splitting field of x² + 1), while Q ⊂ Q(⁴√2) is not (x⁴ − 2
is the minimal polynomial of ⁴√2 but doesn’t split completely). ◁▷


In Section 7.2 we will learn how to recognize exactly when F ⊂ K is Galois in the
situation of Proposition 7.1.3.

Definition 7.1.2 and Theorem 6.2.1 imply that |Gal(L/F)| = [L : F] whenever
F ⊂ L is Galois. For an arbitrary finite extension, the relation between the order of
the Galois group and the degree of the extension can be described as follows.


Theorem 7.1.5 Let F ⊂ L be a finite extension. Then:

(a) |Gal(L/F)| divides [L : F].

(b) |Gal(L/F)| ≤ [L : F].

(c) F ⊂ L is a Galois extension if and only if |Gal(L/F)| = [L : F].


Proof: To prove part (a), let K be the fixed field of Gal(L/F). Then F ⊂ K ⊂ L,
and the proof of (a) ⇒ (b) from Theorem 7.1.1 implies that Gal(L/K) = Gal(L/F)
(be sure you understand why). Thus K is the fixed field of Gal(L/K), so that K ⊂ L
is a Galois extension by Theorem 7.1.1. Hence

[L : F] = [L : K][K : F] = |Gal(L/K)| [K : F] = |Gal(L/F)| [K : F],

where the first equality uses Theorem 4.3.8, the second uses Theorem 6.2.1 (K ⊂ L
is Galois), and the third uses Gal(L/K) = Gal(L/F). Hence the order of Gal(L/F)
divides [L : F], as claimed.

Part (b) is an immediate consequence of part (a). As for part (c), note that one
direction follows from Theorem 6.2.1. For the converse, suppose that F ⊂ L is a
finite extension with |Gal(L/F)| = [L : F], and let K be the fixed field of Gal(L/F).
If we can prove that K = F, then Theorem 7.1.1 will imply that F ⊂ L is a Galois
extension.

To show that K = F, first observe that the proof of part (a) given above implies
that K ⊂ L is a Galois extension and that Gal(L/K) = Gal(L/F). Then

[L : F] = |Gal(L/F)| = |Gal(L/K)| = [L : K],

where the first equality is by assumption, the second uses Gal(L/K) = Gal(L/F),
and the third holds because K ⊂ L is a Galois extension. We conclude that K = F
just as in the proof of (a) ⇒ (b) from Theorem 7.1.1. ∎
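
As a quick check of part (a), consider the non-Galois extension Q ⊂ L = Q(∛2) discussed after Definition 7.1.2. The computation below is an added illustration, not part of the proof. Any automorphism of L must send ∛2 to a root of x³ − 2 lying in L, and since L consists of real numbers, ∛2 itself is the only candidate. Hence the Galois group is trivial and its fixed field is K = L, which gives:

```latex
|\mathrm{Gal}(L/\mathbb{Q})| = 1, \qquad
[L:\mathbb{Q}] = [L:K]\,[K:\mathbb{Q}] = 1 \cdot 3 = 3 .
```

So the order of the Galois group divides the degree but is strictly smaller, which is consistent with part (c).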


Theorem 7.1.1 gave three ways to characterize Galois extensions, and part (c) of
Theorem 7.1.5 gives a fourth. Putting these together, we see that a finite extension
F ⊂ L is Galois if and only if any of the following equivalent conditions is satisfied:
• L is the splitting field of a separable polynomial in F[x].
• F is the fixed field of Gal(L/F) acting on L.
• F ⊂ L is a normal separable extension.
• |Gal(L/F)| = [L : F].


B. Finite Separable Extensions. The theory of Galois extensions implies the 
following characterization of finite separable extensions. 


Proposition 7.1.6 Let F ⊂ L be a finite extension. Then L is separable over F if and
only if L = F(α_1,...,α_n), where each α_i is separable over F.


Proof: First assume that F ⊂ L is separable. Since it is also finite, Theorem 4.4.3
implies that L has the desired form. For the converse, let L = F(α_1,...,α_n), where
each α_i is separable over F. Our strategy will be to embed L in a larger field that is
separable over F.

Let p_i be the minimal polynomial of α_i over F, and let q_1,...,q_r be the distinct
elements of the set {p_1,...,p_n}. Then Lemma 5.3.4 implies that f = q_1 ⋯ q_r is
separable, since each q_i is. Let M be the splitting field of f, regarded as a polynomial
in L[x]. Thus M = L(β_1,...,β_m), where β_1,...,β_m are the roots of f.

We claim that M = F(β_1,...,β_m). To see why this is true, note that we have the
obvious inclusion

(7.2)    F(β_1,...,β_m) ⊂ L(β_1,...,β_m) = M.

However, the roots β_1,...,β_m include α_1,...,α_n, so that

L = F(α_1,...,α_n) ⊂ F(β_1,...,β_m).

Thus F(β_1,...,β_m) contains both L and β_1,...,β_m, which gives the inclusion

M = L(β_1,...,β_m) ⊂ F(β_1,...,β_m).

Combining this with (7.2), we see that M = F(β_1,...,β_m), as claimed.

This shows that M is the splitting field over F of the separable polynomial f. Then
F ⊂ M is Galois and hence separable by Theorem 7.1.1. Since L ⊂ M, every element
of L is separable over F, so that F ⊂ L is separable. ∎


Proposition 7.1.6 has some nice consequences. For example, if F ⊂ L and α, β ∈ L
are separable over F, then so are α + β, αβ, and α/β (assuming β ≠ 0). This in
turn implies that in characteristic p, any finite extension can be written as a separable
extension followed by a purely inseparable one. You will be asked to prove these
assertions in Exercises 3 and 4.


C. Galois Closures. The proof of Proposition 7.1.6 shows how to embed a finite
separable extension F ⊂ L into a larger Galois extension. This leads to the idea of
Galois closure, which roughly speaking is the smallest extension of L that is Galois
over F. More precisely, we have the following result.


Proposition 7.1.7 Let F ⊂ L be a finite separable extension. Then there is an
extension L ⊂ M such that:

(a) M is Galois over F, i.e., F ⊂ M is a Galois extension.

(b) Given any other extension L ⊂ M′ such that M′ is Galois over F, there is a field
homomorphism φ : M → M′ that is the identity on L.


Proof: Since F ⊂ L is finite and separable, we can write L = F(α_1,...,α_n), where
each α_i is separable over F. Following the proof of Proposition 7.1.6, we get an extension
L ⊂ M such that M is a splitting field over L of the separable polynomial f = q_1 ⋯ q_r,
where q_1,...,q_r are the distinct elements of {p_1,...,p_n} and p_i is the minimal
polynomial of α_i over F. As in the proof of Proposition 7.1.6, we see that F ⊂ M is
a Galois extension.

To show that L ⊂ M satisfies part (b) of the proposition, let L ⊂ M′ be an extension
where M′ is Galois over F. By Theorem 7.1.1, F ⊂ M′ is normal, so that each p_i splits
completely over M′. It follows that f splits completely in M′. Let M″ ⊂ M′ be the
subfield obtained by adjoining the roots of f to F. Furthermore, since each α_i ∈ L ⊂ M′
is a root of f, we have L ⊂ M″. Thus we can regard M″ as a splitting field of f over
L. By the uniqueness of splitting fields (Corollary 5.1.7), there is an isomorphism
φ : M ≃ M″ that is the identity on L. Since M″ ⊂ M′, we can regard φ as a field
homomorphism φ : M → M′. This completes the proof of the proposition. ∎


Be sure you understand why part (b) of the proposition implies that L ⊂ M can be
thought of as the smallest extension of L that is Galois over F. The field constructed
in Proposition 7.1.7 is called the Galois closure of L over F. In Exercise 5 you will
show that the Galois closure of F ⊂ L is unique up to an isomorphism that is the
identity on L.

Related to the idea of Galois closure is the normal closure of a finite extension
F ⊂ L. Roughly speaking, this is the smallest extension of L that is normal over F.
The theory of normal closures is worked out in Exercises 6 and 7.


Historical Notes 


Of the criteria for F ⊂ L to be a Galois extension given in Theorem 7.1.1, the
most elegant is the one involving the fixed field of Gal(L/F). For Galois, this was
his Proposition I, which was the first of his main results [Galois, p. 51]:


PROPOSITION I 


THEOREM. For a given equation, let a,b,c,... be the m roots. There is 
always a group of permutations on the letters a, b,c,... that enjoys the following 
property: 

1° that every function of the roots that is invariant** under the substitutions 
of the group, is rationally known; 


2° conversely, that every function of the roots that is rationally determined, 
is invariant under these substitutions*. 


For Galois, “rationally known” and “rationally determined” refer to elements of a 
field F containing the coefficients of the given equation. Adjoining the roots of this 
equation gives the splitting field L = F(a,b,c,...). Furthermore, Galois assumes that 
the given polynomial “does not have equal roots.” Hence the polynomial is separable, 
so that L is a Galois extension of F. Since every element of L is a “function of the 
roots,” parts 1° and 2° of Galois’s Proposition I say that F is the fixed field of the 
Galois group acting on L. Thus we recover part (b) of Theorem 7.1.1. 

In Galois’s manuscript, Proposition I includes two notes, marked with ** and * 
above. We will explain these notes when we discuss Galois’s work in Chapter 12. 

We should also mention that Galois knew the formula (7.1) for the minimal
polynomial given in the proof of Theorem 7.1.1 (see [Galois, p. 85]). In Chapter 12,
we will see that this formula is a generalization of the resolvent polynomial defined
by Lagrange in 1770.


Exercises for Section 7.1 


Exercise 1. Given a finite extension F ⊂ L and a subgroup H ⊂ Gal(L/F), prove that
L_H = {α ∈ L | σ(α) = α for all σ ∈ H} is a subfield of L containing F.


Exercise 2. In the proof of (c) ⇒ (a) in Theorem 7.1.1, give the details of how the proof of
Theorem 5.2.4 shows that L is the splitting field of f over F.


Exercise 3. Suppose that F ⊂ L and that α, β ∈ L are separable over F. Prove that α + β, αβ,
and α/β (assuming β ≠ 0) are also separable over F.


Exercise 4. Let F ⊂ L be a finite extension, and assume F has characteristic p. Then consider
the set K = {α ∈ L | α is separable over F}.
(a) Use Proposition 7.1.6 to show that K is a subfield of L containing F. Thus F ⊂ K is a
separable extension.
(b) Use part (c) of Theorem 5.3.15 to show that K ⊂ L is purely inseparable.


Exercise 5. Prove that the Galois closure of a finite separable extension F ⊂ L is unique up to
an isomorphism that is the identity on L.


Exercise 6. In analogy with the Galois closure of a finite separable extension, every finite
extension F ⊂ L has a normal closure, which is essentially the smallest extension of L that is
normal over F. State and prove the analog of Proposition 7.1.7 for normal closures.


Exercise 7. Prove that the normal closure of a finite extension F C L is unique up to an 
isomorphism that is the identity on L. 


Exercise 8. Let h be the polynomial (7.1) used in the proof of (b) ⇒ (c) from Theorem 7.1.1.
Show that there is an integer m such that

∏_{σ ∈ Gal(L/F)} (x − σ(α)) = h^m.


Exercise 9. For each of the following extensions, say whether it is a Galois extension. Be sure 
to say which of our four criteria (the three parts of Theorem 7.1.1 and part (c) of Theorem 7.1.5) 
you are using. 

(a) Qc Q(V2, V2). 

(b) Q ⊂ Q(α, β), α, β distinct roots of x³ + x² + 2x + 1.

(c) F_p(t^p) ⊂ F_p(t), t a variable.

(d) C(t + t⁻¹) ⊂ C(t), t a variable.

(e) C(t^n) ⊂ C(t), t a variable, n a positive integer.

The ideas underlying the extensions given in parts (d) and (e) will be discussed in Section 7.5.


Exercise 10. Prove that Q(ω, ∛2) is the Galois closure of Q(∛2) over Q.
Exercise 11. Construct the Galois closure of Q ¢ Q(2). 


Exercise 12. Let F ⊂ L be an extension of degree 2, where F has characteristic ≠ 2.
(a) Show that L = F(α), where α is a root of an irreducible polynomial of degree 2.
(b) Show that the minimal polynomial of α over F is separable.
(c) Conclude that F ⊂ L is a Galois extension with Gal(L/F) ≃ Z/2Z.
(d) By completing the square, show that there is β ∈ L such that L = F(β) and β² ∈ F.
For β as in part (d), let a = β² ∈ F. Then we can write β = √a. This shows that if F has
characteristic ≠ 2, then every degree 2 extension of F is obtained by taking a square root.


7.2. NORMAL SUBGROUPS AND NORMAL EXTENSIONS 


In Chapter 5 we introduced normal extensions, and in abstract algebra you learned 
about normal subgroups. This section will explain why it is no accident that these 
concepts have the same name. 


A. Conjugate Fields. In high school algebra one calls 2 − √3 the conjugate of
2 + √3. This terminology is used for subfields as follows.


Definition 7.2.1 Suppose that we have finite extensions F ⊂ K ⊂ L. Then, for an
automorphism σ ∈ Gal(L/F), we call

σK = {σ(α) | α ∈ K}

a conjugate field of K.


We should write σ(K) instead of σK, but we prefer the latter because it is less
cumbersome. Note that σK is a subfield of L, since σ is a field isomorphism.
We can compute the degree of a conjugate field as follows.


Lemma 7.2.2 Let F ⊂ K ⊂ L and σ ∈ Gal(L/F) be as in Definition 7.2.1. Then
F ⊂ σK ⊂ L and [K : F] = [σK : F].


Proof: The inclusion F ⊂ σK is obvious, since F ⊂ K and σ is the identity on
F. Also, σ(aα) = σ(a)σ(α) = aσ(α) when a ∈ F and α ∈ K. It follows that
σ|_K : K → σK is linear over F in the sense of linear algebra. Hence σ|_K is an
isomorphism of vector spaces over F, so that [K : F] = dim_F K = dim_F σK = [σK : F]
by the definition of degree given in Section 4.3. ∎


Here is an example of conjugate fields. 


Example 7.2.3 Consider the extension Q ⊂ Q(ω, ∛2), where ω = e^{2πi/3}. Then we
have the following intermediate fields:

(7.3)    [Diagram: Q(ω, ∛2) at the top; the intermediate fields Q(ω), Q(∛2), Q(ω∛2), Q(ω²∛2) in the middle row; Q at the bottom; arrows point upward along the inclusions.]

Recall that σ ∈ Gal(Q(ω, ∛2)/Q) is determined uniquely by

(7.4)    σ(ω) ∈ {ω, ω²} and σ(∛2) ∈ {∛2, ω∛2, ω²∛2}.

In Exercise 2 of Section 6.2 we showed that all possible combinations of σ(ω) and
σ(∛2) actually occur. In Exercise 1 below you will check the following:

• Q(∛2) has conjugate fields Q(∛2), Q(ω∛2), and Q(ω²∛2).

• Q(ω) equals all of its conjugates.

Later in the section we will explain the second bullet using Galois theory. ◁▷


We next relate intermediate fields to subgroups of the Galois group. 


Lemma 7.2.4 Suppose that we have finite extensions F ⊂ K ⊂ L. Then:
(a) Gal(L/K) is a subgroup of Gal(L/F).
(b) If σ ∈ Gal(L/F), then Gal(L/σK) = σGal(L/K)σ⁻¹ in Gal(L/F).


Proof: Each σ ∈ Gal(L/K) is an automorphism of L that is the identity on K. Then
σ ∈ Gal(L/F) follows from F ⊂ K; hence Gal(L/K) ⊂ Gal(L/F). Since both are
groups under composition, we conclude that Gal(L/K) is a subgroup.

To prove part (b), let γ ∈ σGal(L/K)σ⁻¹ and β ∈ σK. Then γ = στσ⁻¹ for some
τ ∈ Gal(L/K), and β = σ(α) for some α ∈ K. Thus

γ(β) = στσ⁻¹(σ(α))
     = σ(τ(α))
     = σ(α) = β,

where the third equality follows because τ is the identity on K. Hence γ is the identity
on σK, which implies that σGal(L/K)σ⁻¹ ⊂ Gal(L/σK). The opposite inclusion is
equally straightforward (see Exercise 2), and the lemma follows. ∎


In group theory, a conjugate of a subgroup H ⊂ G is a subgroup of the form
gHg⁻¹ for some g ∈ G. Thus part (b) of Lemma 7.2.4 tells us that conjugate fields
correspond to conjugate subgroups.


B. Normal Subgroups. The first main theorem of this section explains how 
normal subgroups relate to normal extensions. 


Theorem 7.2.5 Suppose that we have fields F ⊂ K ⊂ L, where F ⊂ L is a Galois
extension. Then the following conditions are equivalent:

(a) K = σK for all σ ∈ Gal(L/F), i.e., K equals all of its conjugates.

(b) Gal(L/K) is a normal subgroup of Gal(L/F).

(c) F ⊂ K is a Galois extension.

(d) F ⊂ K is a normal extension.


Proof: We first show that (a) and (b) are equivalent. Proving (a) ⇒ (b) is especially
easy, for K = σK and Lemma 7.2.4 imply that

Gal(L/K) = Gal(L/σK) = σGal(L/K)σ⁻¹.

Thus Gal(L/K) is normal in Gal(L/F) since this holds for all σ ∈ Gal(L/F). To
prove (b) ⇒ (a), first note that if Gal(L/K) is normal and σ ∈ Gal(L/F), then using
Lemma 7.2.4 a second time implies that

Gal(L/K) = σGal(L/K)σ⁻¹ = Gal(L/σK).

However, K ⊂ L and σK ⊂ L are Galois extensions by Proposition 7.1.3. Hence

K = fixed field of Gal(L/K) = fixed field of Gal(L/σK) = σK,

where the first and third equalities use Theorem 7.1.1.

We next observe that (c) and (d) are equivalent. The implication (c) ⇒ (d) is
trivial, since every Galois extension is normal and separable. For (d) ⇒ (c), note that
since F ⊂ L is Galois, it is also separable, and then any intermediate field F ⊂ K ⊂ L
is also separable over F. If in addition K is normal over F, then it is normal and
separable, and hence Galois.

Finally, we prove that (a) ⇔ (d). For (a) ⇒ (d), let f ∈ F[x] be irreducible over F
with a root α ∈ K. We need to show that f splits completely over K. In the proof of
Theorem 7.1.1 we showed that up to a constant, f is the polynomial

h(x) = ∏_{i=1}^{r} (x − α_i)

from (7.1), where α_1 = α, α_2,...,α_r are the distinct elements of L obtained by
applying the elements of Gal(L/F) to α. Since α ∈ K, each α_i lies in a conjugate
field of K. Using (a), we conclude that α_i ∈ K for all i, so that h and hence f split
completely over K.

It remains to show that (d) ⇒ (a). Take α ∈ K and σ ∈ Gal(L/F), and let p be
the minimal polynomial of α over F. By Proposition 6.1.4, σ(α) is also a root of
p. Since F ⊂ K is normal, p splits completely over K, which implies that σ(α) ∈ K.
Thus σK ⊂ K, and then equality follows by Lemma 7.2.2. ∎


Here is an example of how this theorem works. 


Example 7.2.6 Consider Q ⊂ L = Q(ω, ∛2) studied in (7.3). By the discussion
following (7.4), there are automorphisms σ, τ ∈ Gal(L/Q) such that

(7.5)    σ(ω) = ω, σ(∛2) = ω∛2 and τ(ω) = ω², τ(∛2) = ∛2.

Label the roots of x³ − 2 as α_1 = ∛2, α_2 = ω∛2, and α_3 = ω²∛2, and consider the
isomorphism Gal(L/Q) ≃ S_3 given by the action of the automorphisms on the roots
α_1, α_2, α_3. Then it is easy to see that

σ ↦ (123), τ ↦ (23).

Since these permutations generate S_3, it follows that σ and τ generate Gal(L/Q).

Now consider the fields in the diagram (7.3). Each such field K gives a subgroup
Gal(L/K) ⊂ Gal(L/Q). Furthermore, in Exercise 3 you will show that

(7.6)    K_1 ⊂ K_2 ⊂ L ⇒ Gal(L/K_1) ⊃ Gal(L/K_2).

In other words, larger fields correspond to smaller Galois groups. Then we claim
that for the fields K of (7.3), the map K ↦ Gal(L/K) gives the following diagram of
subgroups of Gal(L/Q):


(7.7)    [Diagram: {e} at the top; the subgroups ⟨σ⟩, ⟨τ⟩, ⟨σ²τ⟩, ⟨στ⟩ in the middle row; Gal(L/Q) at the bottom; arrows point downward along the inclusions.]

In this diagram, ⟨σ⟩ is the subgroup generated by σ. Thus ⟨σ⟩ = {e, σ, σ²}, since σ
has order 3. Similarly, ⟨τ⟩, ⟨σ²τ⟩, ⟨στ⟩ are subgroups of order 2.

To see how (7.3) gives (7.7), consider the case of Q(ω). We know that

|Gal(L/Q(ω))| = [L : Q(ω)] = [Q(ω, ∛2) : Q(ω)] = [Q(ω, ∛2) : Q] / [Q(ω) : Q] = 6/2 = 3.

Furthermore, (7.5) shows that σ is the identity on Q(ω), since σ(ω) = ω. Thus
σ ∈ Gal(L/Q(ω)), and it follows easily that Gal(L/Q(ω)) = ⟨σ⟩. In Exercise 4
you will give similar arguments for the other fields in (7.3) to verify that applying
K ↦ Gal(L/K) to (7.3) gives (7.7).

We can relate (7.3) and (7.7) to Lemma 7.2.4 and Theorem 7.2.5 as follows. First
consider Q(ω). This is the splitting field of x² + x + 1 over Q, so that Q ⊂ Q(ω) is
Galois. By Theorem 7.2.5, this implies:

• Q(ω) coincides with its conjugates in L, as we saw in Example 7.2.3.

• Gal(L/Q(ω)) = ⟨σ⟩ is normal in Gal(L/Q).

We can also go the other way. Under the isomorphism Gal(L/Q) ≃ S_3, ⟨σ⟩ maps to
the normal subgroup A_3. Thus ⟨σ⟩ is normal in Gal(L/Q), so that Q ⊂ Q(ω) is a
Galois extension by Theorem 7.2.5.

We can do a similar analysis for Q(∛2). Example 7.2.3 shows that this field has
three conjugates in L. Hence

(7.8)    Gal(L/Q(∛2)) = ⟨τ⟩

is not normal in Gal(L/Q), by Theorem 7.2.5. We can relate the conjugate fields
of Q(∛2) to the conjugates of ⟨τ⟩ as follows. By Exercise 1, the conjugate fields
of Q(∛2) are itself, σQ(∛2), and σ²Q(∛2). Then Lemma 7.2.4 implies that the
Galois groups of L over these fields are the conjugate subgroups

⟨τ⟩, σ⟨τ⟩σ⁻¹, σ²⟨τ⟩σ⁻².

One easily checks that these are the subgroups ⟨τ⟩, ⟨σ²τ⟩, ⟨στ⟩ from (7.7). ◁▷
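
The group-theoretic claims of Example 7.2.6 can be checked by brute force. The sketch below is an added illustration: it encodes σ and τ as the permutations (123) and (23) of the root indices (shifted to 0, 1, 2), and the helper names compose, inverse, generate, and conjugate are hypothetical.

```python
def compose(p, q):
    """Return p∘q (apply q first, then p); p[i] is the image of index i."""
    return tuple(p[q[i]] for i in range(len(q)))

def inverse(p):
    inv = [0] * len(p)
    for i, image in enumerate(p):
        inv[image] = i
    return tuple(inv)

def generate(gens):
    """Subgroup generated by gens (closure under left multiplication)."""
    identity = tuple(range(len(gens[0])))
    group = {identity}
    frontier = [identity]
    while frontier:
        g = frontier.pop()
        for s in gens:
            h = compose(s, g)
            if h not in group:
                group.add(h)
                frontier.append(h)
    return frozenset(group)

def conjugate(g, H):
    """Return the subgroup g H g^{-1}."""
    g_inv = inverse(g)
    return frozenset(compose(compose(g, h), g_inv) for h in H)

sigma = (1, 2, 0)  # the 3-cycle (123) acting on indices 0, 1, 2
tau = (0, 2, 1)    # the transposition (23)

G = generate([sigma, tau])
H_sigma = generate([sigma])
H_tau = generate([tau])

sigma_is_normal = all(conjugate(g, H_sigma) == H_sigma for g in G)
tau_conjugates = {conjugate(g, H_tau) for g in G}

print(len(G), sigma_is_normal, len(tau_conjugates))  # 6 True 3
```

The output reflects that σ and τ generate a group of order 6, that ⟨σ⟩ is normal, and that ⟨τ⟩ has three distinct conjugates, matching the three conjugate fields of Q(∛2).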
In group theory, normal subgroups are important because they lead to quotient
groups. Recall that if N is normal in G, then left cosets of N coincide with right
cosets, and the set G/N consisting of all cosets of N in G becomes a group under
multiplication, the quotient group. Theorem 7.2.5 shows that normal subgroups arise
naturally in Galois theory.

When Gal(L/K) ⊂ Gal(L/F) is normal, the second main theorem of this section
explains how to interpret the quotient group.


Theorem 7.2.7 Suppose that we have extension fields F ⊂ K ⊂ L, where F ⊂ K and
F ⊂ L are Galois. Then Gal(L/K) is a normal subgroup of Gal(L/F), and there is a
natural isomorphism of groups

Gal(L/F)/Gal(L/K) ≃ Gal(K/F).


Proof: If F ⊂ K is Galois, then Gal(L/K) is normal in Gal(L/F) by Theorem 7.2.5.
It remains to relate Gal(L/F)/Gal(L/K) to Gal(K/F).

For a fixed σ ∈ Gal(L/F), the restriction of σ to K gives the isomorphism σ|_K :
K ≃ σK. But Theorem 7.2.5 tells us that σK = K, since F ⊂ K is Galois. It follows
that σ|_K is an automorphism of K. Since σ is the identity on F, the same is true for
σ|_K (do you see why?), so that σ|_K ∈ Gal(K/F).

It follows that σ ↦ σ|_K defines a function

Φ : Gal(L/F) → Gal(K/F).

Furthermore, in Exercise 5 you will verify that for σ, τ ∈ Gal(L/F), we have

(7.9)    (στ)|_K = (σ ∘ τ)|_K = σ|_K ∘ τ|_K = σ|_K τ|_K,

where the first composition takes place in Gal(L/F) and the second in Gal(K/F).
This shows that Φ is a group homomorphism.

The kernel of Φ is easy to determine, for if σ ∈ Gal(L/F), then

σ ∈ Ker(Φ) ⇔ σ|_K = 1_K ⇔ σ is the identity on K ⇔ σ ∈ Gal(L/K).

Thus Ker(Φ) = Gal(L/K), and then the Fundamental Theorem of Group Homomorphisms
implies that Φ induces an isomorphism

Gal(L/F)/Gal(L/K) ≃ Im(Φ) ⊂ Gal(K/F).

The final step is to show that Im(Φ) = Gal(K/F). The key point is that since all
of the extensions involved are Galois extensions, their degrees equal the order of the
corresponding Galois groups. Thus

|Im(Φ)| = |Gal(L/F)/Gal(L/K)| = |Gal(L/F)| / |Gal(L/K)| = [L : F] / [L : K] = [K : F] = |Gal(K/F)|.

This shows that Im(Φ) = Gal(K/F) and completes the proof of the theorem. ∎
Here is a simple example of Theorem 7.2.7.


Example 7.2.8 Consider Q ⊂ Q(ω) ⊂ L = Q(ω, ∛2). Since Q ⊂ Q(ω) is Galois
and Gal(L/Q(ω)) = ⟨σ⟩, where σ is as in (7.5), the theorem implies that

Gal(Q(ω)/Q) ≃ Gal(L/Q)/⟨σ⟩ ≃ S_3/A_3 ≃ Z/2Z.

Note that if τ is as in (7.5), then Gal(Q(ω)/Q) = {1_{Q(ω)}, τ|_{Q(ω)}}. ◁▷
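
The restriction map Φ of Theorem 7.2.7 can be traced through this example with a small sketch. The encoding is an assumption made for illustration: by (7.4), an automorphism of L = Q(ω, ∛2) is recorded as a pair (a, b) meaning ω ↦ ω^a and ∛2 ↦ ω^b ∛2, with a ∈ {1, 2} and b ∈ {0, 1, 2}. Composition then works out to (a1, b1)∘(a2, b2) = (a1·a2 mod 3, a1·b2 + b1 mod 3), and restriction to Q(ω) keeps only a. The function names are hypothetical.

```python
from itertools import product

def compose(s, t):
    """Return s∘t for automorphisms encoded as (a, b):
    omega -> omega^a and cbrt(2) -> omega^b * cbrt(2)."""
    a1, b1 = s
    a2, b2 = t
    return (a1 * a2 % 3, (a1 * b2 + b1) % 3)

def restrict(s):
    """Restriction to Q(omega): only the action on omega survives."""
    return s[0]

G = list(product((1, 2), (0, 1, 2)))  # the six pairs allowed by (7.4)
sigma = (1, 1)                        # sigma from (7.5)
tau = (2, 0)                          # tau from (7.5)

# Phi is a homomorphism: restricting a composite multiplies the exponents mod 3.
is_hom = all(
    restrict(compose(s, t)) == restrict(s) * restrict(t) % 3
    for s in G for t in G
)
kernel = [s for s in G if restrict(s) == 1]
image = {restrict(s) for s in G}

print(is_hom, sorted(kernel), sorted(image))
# True [(1, 0), (1, 1), (1, 2)] [1, 2]
```

The kernel consists of the three powers of σ, so it is Gal(L/Q(ω)) = ⟨σ⟩, and the image has 6/3 = 2 elements, in line with Gal(Q(ω)/Q) ≃ Z/2Z.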


Mathematical Notes 


There are two ideas in this section to comment on. 


• The Galois Correspondence. In Section 7.3 we will see that (7.3) and (7.7) give
an example of the Galois correspondence. It is easy to check that (7.7) gives all
subgroups of Gal(L/Q) for L = Q(ω, ∛2) (see Exercise 6). Then Theorem 7.3.2 will
tell us that (7.3) gives all fields between Q and Q(ω, ∛2). This is not obvious: while
the subfields in (7.3) are easy to find, how do we know that they are all the subfields?
This is a good illustration of the power of the Galois correspondence.

We should also mention that (7.6) is also part of the Galois correspondence. The
idea behind (7.6) is that as the field K gets larger, the Galois group Gal(L/K) gets
smaller. This explains why the arrows in (7.3) go up while those in (7.7) go down.
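
The claim that (7.7) lists all subgroups of Gal(L/Q) ≃ S_3 can be confirmed by exhaustive search. The following sketch is an added illustration, not a substitute for Exercise 6: it tests every subset of S_3 containing the identity for closure under composition, and the helper names are hypothetical.

```python
from collections import Counter
from itertools import combinations, permutations

def compose(p, q):
    """Return p∘q (apply q first, then p)."""
    return tuple(p[q[i]] for i in range(len(q)))

def is_subgroup(subset):
    # A nonempty finite subset closed under composition is a subgroup.
    return all(compose(p, q) in subset for p in subset for q in subset)

elements = list(permutations(range(3)))   # all of S_3
identity = (0, 1, 2)
others = [p for p in elements if p != identity]

subgroups = []
for k in range(len(others) + 1):
    for chosen in combinations(others, k):
        candidate = frozenset(chosen) | {identity}
        if is_subgroup(candidate):
            subgroups.append(candidate)

orders = Counter(len(H) for H in subgroups)
print(len(subgroups), dict(orders))  # 6 {1: 1, 2: 3, 3: 1, 6: 1}
```

There are exactly six subgroups: the trivial group, three of order 2, one of order 3, and the whole group, which matches the six entries of (7.7).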


• Conjugate Fields. If F ⊂ K is not Galois, then K will have a certain number of
conjugate fields in L (assuming F ⊂ K ⊂ L and L is Galois over F). We claim that
the number of such conjugate fields is related to the normalizer of a subgroup.

To see this, we first analyze when a conjugate equals the given field. Suppose that
F ⊂ K ⊂ L, where L is Galois over F, and let σ ∈ Gal(L/F). In Exercise 7 you will
show that

(7.10)    K = σK ⇔ Gal(L/K) = σGal(L/K)σ⁻¹ in Gal(L/F).

In group theory, the normalizer of a subgroup H ⊂ G is the set

N_G(H) = {g ∈ G | gHg⁻¹ = H}.

One can show that N_G(H) is a subgroup of G, H is a normal subgroup of N_G(H),
and N_G(H) is the largest subgroup of G in which H is normal. (You should do
Exercise 8 if you’re not familiar with normalizers.) From (7.10), it follows that for
σ ∈ Gal(L/F), we have

K = σK ⇔ σ is in the normalizer of Gal(L/K) in Gal(L/F).

Using this and standard facts about group actions, one can prove that if F ⊂ L is
Galois and K is an intermediate field, then the number of conjugate fields of K in L
is given by the index

[Gal(L/F) : N] = |Gal(L/F)| / |N|,

where N is the normalizer of Gal(L/K) in Gal(L/F) (see Exercise 9 for the details).
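
For the running example Gal(L/Q) ≃ S_3, the index formula can be checked directly. This sketch is an added illustration with hypothetical helper names. It computes the normalizer of ⟨τ⟩ and of ⟨σ⟩, with τ = (23) and σ = (123) on indices 0, 1, 2, and compares the index of each normalizer with the number of distinct conjugate subgroups.

```python
from itertools import permutations

def compose(p, q):
    """Return p∘q (apply q first, then p); p[i] is the image of index i."""
    return tuple(p[q[i]] for i in range(len(q)))

def inverse(p):
    inv = [0] * len(p)
    for i, image in enumerate(p):
        inv[image] = i
    return tuple(inv)

def conjugate(g, H):
    """Return the subgroup g H g^{-1}."""
    g_inv = inverse(g)
    return frozenset(compose(compose(g, h), g_inv) for h in H)

def normalizer(G, H):
    """All g in G with g H g^{-1} = H."""
    return [g for g in G if conjugate(g, H) == H]

G = list(permutations(range(3)))                         # all of S_3
H_tau = frozenset({(0, 1, 2), (0, 2, 1)})                # <tau>
H_sigma = frozenset({(0, 1, 2), (1, 2, 0), (2, 0, 1)})   # <sigma>

for name, H in (("<tau>", H_tau), ("<sigma>", H_sigma)):
    N = normalizer(G, H)
    index = len(G) // len(N)
    num_conjugates = len({conjugate(g, H) for g in G})
    print(name, "index of normalizer:", index, "conjugates:", num_conjugates)
```

For ⟨τ⟩ both numbers are 3, matching the three conjugate fields of Q(∛2); for ⟨σ⟩ both are 1, matching the fact that Q(ω) equals all of its conjugates.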
Historical Notes


In a letter written the night before his fatal duel, Galois describes the concept of
normal subgroup as follows [Galois, pp. 173-175]:

In other words, when a group G contains another group H, the group G can be
divided into groups that are obtained by performing the same substitution on the
permutations of H, so that G = H + HS + HS′ + ⋯ and it can also be divided
into groups with the same substitutions so that G = H + TH + T′H + ⋯. These
two decompositions do not ordinarily coincide. When they do coincide, the
decomposition is said to be proper.


In modern terms, equality of the decompositions

G = H + HS + HS′ + ⋯ = H + TH + T′H + ⋯

implies that the left cosets of H coincide with the right cosets, which is equivalent
to the usual definition of normal subgroup. However, we will see in Chapter 12 that
Galois’s “groups” are not quite what you might think.

Galois was also aware of Theorem 7.2.5, though again his terminology takes some 
explanation. The details of what Galois knew can be found in [Edwards, pp. 47-66], 
[3, pp. 80-84], and [8]. 

The second main theorem of this section, Theorem 7.2.7, concerns quotient groups. 
Quotient groups weren’t defined until much later in the nineteenth century, though 
hints can be found in the examples worked out in Galois’s memoir (see [3, p. 82]). 
When discussing Galois’s work in 1852, Betti made some further progress toward a 
definition of quotient group, and by the 1880s quotient groups were well established. 
For us, the key point is that both normality and quotient groups first arose in the 
context of Galois theory. 


Exercises for Section 7.2 


Exercise 1. In the diagram (7.3), verify the following.
(a) Q(∛2) has conjugate fields Q(∛2), Q(ω∛2), and Q(ω²∛2).
(b) Q(ω) equals all of its conjugates.


Exercise 2. Complete the proof of Lemma 7.2.4 by showing that
Gal(L/σK) ⊂ σGal(L/K)σ⁻¹.


Exercise 3. Prove (7.6).


Exercise 4. Verify that applying K ↦ Gal(L/K) to (7.3) gives (7.7). Don’t forget to include
the extreme cases K = Q and K = L.


Exercise 5. Prove (7.9) in the proof of Theorem 7.2.7.


Exercise 6. For the extension Q ⊂ L = Q(ω, ∛2), we listed some subgroups of Gal(L/Q) in
diagram (7.7). Prove that this gives all subgroups of Gal(L/Q).


Exercise 7. Suppose that F ⊂ K ⊂ L, where L is Galois over F, and let σ ∈ Gal(L/F). Show
that

K = σK ⇔ Gal(L/K) = σGal(L/K)σ⁻¹ in Gal(L/F).


Exercise 8. Let H be a subgroup of a group G, and let N_G(H) = {g ∈ G | gHg⁻¹ = H} be the
normalizer of H in G, as defined in the Mathematical Notes.
(a) Prove that N_G(H) is a subgroup of G containing H.
(b) Prove that H is normal in N_G(H).
(c) Let N be a subgroup of G containing H. Prove that H is normal in N if and only if
N ⊂ N_G(H). Do you see why this shows that N_G(H) is the largest subgroup of G in which
H is normal?
(d) Prove that H is normal in G if and only if N_G(H) = G.


Exercise 9. Let F ⊂ L be Galois, and suppose that F ⊂ K ⊂ L is an intermediate field. The
goal of this exercise is to show that the number of conjugates of K in L is

[Gal(L/F) : N] = |Gal(L/F)| / |N|,

where N is the normalizer of Gal(L/K) in Gal(L/F). More precisely, suppose that the distinct
conjugates of K are

K = σ_1K, σ_2K, ..., σ_rK,

where σ_1 = e. Then we need to show that r = [Gal(L/F) : N].
(a) Show that Gal(L/F) acts on the set of conjugates {σ_1K, σ_2K, ..., σ_rK}.
(b) Show that the isotropy subgroup of K is the normalizer subgroup N.
(c) Explain how r = [Gal(L/F) : N] follows from the Fundamental Theorem of Group Actions
(Theorem A.4.9 from Appendix A).


Exercise 10. In (7.5), explain why τ is complex conjugation restricted to Q(ω, ∛2).


Exercise 11. Consider the extension Q ⊂ L = Q(√2, √3).
(a) Show that Gal(L/Q) = {e, σ, τ, στ}, where

σ(√2) = √2,     σ(√3) = −√3,
τ(√2) = −√2,    τ(√3) = √3.

(b) Find all subgroups of Gal(L/Q), and use this to draw a picture similar to (7.7).
(c) For each subgroup of part (b), determine the corresponding subfield of L and use this to
draw a picture similar to (7.3).
(d) Explain why all of the subgroups in part (b) are normal. What does this imply about the
subfields in part (c)?
In the next section, we will see that the Galois correspondence implies that the subfields you
found in part (c) give all subfields of L.


7.3. THE FUNDAMENTAL THEOREM OF GALOIS THEORY 


We can now state the main result of this chapter, which describes precisely the relation
between subgroups and subfields. Recall that if we are given a finite extension F ⊂ L
and a subgroup H ⊂ Gal(L/F), then we have the fixed field

L_H = {α ∈ L | σ(α) = α for all σ ∈ H}.

In Exercise 1 of Section 7.1 you showed that L_H is a subfield of L containing F. The
first part of the Fundamental Theorem of Galois Theory goes as follows.


Theorem 7.3.1 Let F ⊂ L be a Galois extension.
(a) For an intermediate field F ⊂ K ⊂ L, its Galois group Gal(L/K) ⊂ Gal(L/F)
has fixed field

L_{Gal(L/K)} = K.

Furthermore, |Gal(L/K)| = [L : K] and [Gal(L/F) : Gal(L/K)] = [K : F].
(b) For a subgroup H ⊂ Gal(L/F), its fixed field F ⊂ L_H ⊂ L has Galois group

Gal(L/L_H) = H.

Furthermore, [L : L_H] = |H| and [L_H : F] = [Gal(L/F) : H].


Proof: Part (a) follows easily from earlier results. We are assuming that F ⊂ L is
Galois, so that K ⊂ L is also Galois by Proposition 7.1.3. Then K = L_{Gal(L/K)} follows
from Theorem 7.1.1 and the definition of Galois extension.

Since K ⊂ L and F ⊂ L are both Galois, we have |Gal(L/K)| = [L : K] and
|Gal(L/F)| = [L : F] by Theorem 6.2.1. Using these equalities and the Tower Theorem
(Theorem 4.3.8), we obtain

[Gal(L/F) : Gal(L/K)] = |Gal(L/F)| / |Gal(L/K)| = [L : F] / [L : K] = [K : F].

This completes the proof of part (a).
To prove part (b), let H be a subgroup of Gal(L/F). This gives F ⊂ L_H ⊂ L, and
since every σ ∈ H is the identity on L_H, we have

(7.11)    H ⊂ Gal(L/L_H).

To prove that equality occurs, we will give a classic proof using the Theorem of the
Primitive Element. Observe that L_H ⊂ L is a finite separable extension (since F ⊂ L
is), so that L = L_H(α) for some α ∈ L by Corollary 5.4.2. Then consider

h(x) = ∏_{σ ∈ H} (x − σ(α)).

By standard arguments, the coefficients of h are fixed by H (be sure you can prove
this carefully). Thus h ∈ L_H[x] satisfies h(α) = 0. It follows that if p ∈ L_H[x] is the
minimal polynomial of α over L_H, then p | h. This implies that

(7.12)    |H| = deg(h) ≥ deg(p) = [L_H(α) : L_H] = [L : L_H],

where the second equality follows because p is the minimal polynomial of α over
L_H. Combining (7.11) and (7.12), we obtain

[L : L_H] ≤ |H| ≤ |Gal(L/L_H)|.

However, Proposition 7.1.3 implies that L_H ⊂ L is Galois, so that we also have
|Gal(L/L_H)| = [L : L_H]. Then the above inequalities easily imply that

|H| = |Gal(L/L_H)|,

and H = Gal(L/L_H) follows immediately. Similar to part (a), we conclude that
[Gal(L/F) : H] = [L_H : F]. We leave the details as Exercise 1. ∎


Here is the second part of the Fundamental Theorem of Galois Theory. 


Theorem 7.3.2 Let $F \subset L$ be a Galois extension. Then the maps between intermediate 
fields $F \subset K \subset L$ and subgroups $H \subset \mathrm{Gal}(L/F)$ given by 

$$K \mapsto \mathrm{Gal}(L/K), \qquad H \mapsto L_H$$

reverse inclusions and are inverses of each other. Furthermore, if a subfield $K$ 
corresponds to a subgroup $H$ under these maps, then $K$ is Galois over $F$ if and only 
if $H$ is normal in $\mathrm{Gal}(L/F)$, and when this happens, there is a natural isomorphism 

$$\mathrm{Gal}(L/F)/H \simeq \mathrm{Gal}(K/F).$$
Proof: Composing the maps one way gives 
$$K \mapsto \mathrm{Gal}(L/K) \mapsto L_{\mathrm{Gal}(L/K)} = K$$
by part (a) of Theorem 7.3.1, and going the other way gives 
$$H \mapsto L_H \mapsto \mathrm{Gal}(L/L_H) = H$$
by part (b) of the theorem. This proves that the maps $K \mapsto \mathrm{Gal}(L/K)$ and $H \mapsto L_H$ 
are inverses of each other. The map $K \mapsto \mathrm{Gal}(L/K)$ is inclusion-reversing by (7.6), 
and $H_1 \subset H_2 \Rightarrow L_{H_1} \supset L_{H_2}$ follows from the definition of fixed field. 

The final assertions of the theorem follow from Theorems 7.2.5 and 7.2.7. ∎ 


We next give two examples of the Galois correspondence. 


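Example 7.3.3 Consider the extension $\mathbb{Q} \subset L = \mathbb{Q}(\omega, \sqrt[3]{2})$, $\omega = e^{2\pi i/3}$. Recall from 
(7.7) that $\mathrm{Gal}(L/\mathbb{Q}) \simeq S_3$ has subgroups 

[Diagram: the lattice of subgroups of $\mathrm{Gal}(L/\mathbb{Q})$, with $\mathrm{Gal}(L/\mathbb{Q})$ at the bottom.] 

Here, $\sigma, \tau \in \mathrm{Gal}(L/\mathbb{Q})$ are as in (7.5), and Exercise 6 of Section 7.2 shows that these 
are all subgroups of $\mathrm{Gal}(L/\mathbb{Q})$. 
By (7.3), the corresponding fixed fields are 

$$\mathbb{Q}(\omega, \sqrt[3]{2})$$
$$\mathbb{Q}(\omega) \qquad \mathbb{Q}(\sqrt[3]{2}) \qquad \mathbb{Q}(\omega\sqrt[3]{2}) \qquad \mathbb{Q}(\omega^2\sqrt[3]{2})$$
$$\mathbb{Q}$$

The key point is that according to Theorem 7.3.2, these are all subfields of $L = 
\mathbb{Q}(\omega, \sqrt[3]{2})$ containing $\mathbb{Q}$. Furthermore, note that the discussion of conjugate 
extensions, normal subgroups, etc., given in Example 7.2.6 verifies the fine details of 
Theorem 7.3.2. ◁▷ 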


Here is a slightly more complicated example. 


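Example 7.3.4 We get a similar picture for the extension $\mathbb{Q} \subset L = \mathbb{Q}(i, \sqrt[4]{2})$. In 
Exercise 2 you will describe $\mathrm{Gal}(L/\mathbb{Q})$ as follows. 

• $\mathrm{Gal}(L/\mathbb{Q})$ is generated by elements $\sigma, \tau$ such that 
$$\sigma(i) = i,\ \sigma(\sqrt[4]{2}) = i\sqrt[4]{2} \quad\text{and}\quad \tau(i) = -i,\ \tau(\sqrt[4]{2}) = \sqrt[4]{2}.$$
We also have $o(\sigma) = 4$ and $o(\tau) = 2$. 
• $\mathrm{Gal}(L/\mathbb{Q}) \simeq D_8$, where $D_8$ is the dihedral group of order 8. 

We next work out the correspondence between subfields of $L$ and subgroups of 
$\mathrm{Gal}(L/\mathbb{Q})$. 
In Exercise 3 you will show that all subgroups of $\mathrm{Gal}(L/\mathbb{Q})$ are given by 

(7.13) 
$$\{e\}$$
$$\langle\tau\rangle \qquad \langle\sigma^2\tau\rangle \qquad \langle\sigma^2\rangle \qquad \langle\sigma\tau\rangle \qquad \langle\sigma^3\tau\rangle$$
$$\langle\sigma^2,\tau\rangle \qquad \langle\sigma\rangle \qquad \langle\sigma^2,\sigma\tau\rangle$$
$$\mathrm{Gal}(L/\mathbb{Q})$$

and that the corresponding fixed fields are given by 

(7.14) 
$$\mathbb{Q}(i,\sqrt[4]{2})$$
$$\mathbb{Q}(\sqrt[4]{2}) \qquad \mathbb{Q}(i\sqrt[4]{2}) \qquad \mathbb{Q}(i,\sqrt{2}) \qquad \mathbb{Q}((1+i)\sqrt[4]{2}) \qquad \mathbb{Q}((1-i)\sqrt[4]{2})$$
$$\mathbb{Q}(\sqrt{2}) \qquad \mathbb{Q}(i) \qquad \mathbb{Q}(i\sqrt{2})$$
$$\mathbb{Q}$$

Again, Theorem 7.3.2 implies that this gives all subfields of $L = \mathbb{Q}(i, \sqrt[4]{2})$ containing 
$\mathbb{Q}$. Exercise 3 will work out the details of the Galois correspondence. ◁▷ 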
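
The subgroup count in Example 7.3.4 can be sanity-checked by machine. The following is a minimal sketch, not part of the text's argument: it assumes we label the four roots of $x^4 - 2$ as $r_k = i^k\sqrt[4]{2}$ for $k = 0,1,2,3$, so that $\sigma$ acts as $k \mapsto k+1 \bmod 4$ and $\tau$ as $k \mapsto -k \bmod 4$. The names `compose`, `closure`, and `subgroups` are illustrative choices, and the search relies on the fact that every subgroup of a group of order 8 of this type is generated by at most two elements.

```python
from itertools import combinations

N_ROOTS = 4  # roots r_k = i^k * 2^(1/4), k = 0, 1, 2, 3 (labeling is an assumption)


def compose(p, q):
    """Return the permutation p∘q (apply q first, then p)."""
    return tuple(p[q[k]] for k in range(len(q)))


def closure(gens):
    """Return the subgroup generated by gens, as a frozenset of permutations."""
    identity = tuple(range(N_ROOTS))
    elems = {identity}
    frontier = [identity]
    while frontier:
        g = frontier.pop()
        for s in gens:
            h = compose(s, g)
            if h not in elems:
                elems.add(h)
                frontier.append(h)
    return frozenset(elems)


# sigma sends r_k to r_{k+1}; tau (complex conjugation) sends r_k to r_{-k}.
sigma = tuple((k + 1) % N_ROOTS for k in range(N_ROOTS))
tau = tuple((-k) % N_ROOTS for k in range(N_ROOTS))

G = closure([sigma, tau])

# Collect every subgroup generated by one or two elements of G.
subgroups = {closure([g]) for g in G}
subgroups |= {closure(pair) for pair in combinations(sorted(G), 2)}

orders = sorted(len(H) for H in subgroups)
print(len(G), orders)
```

The sorted list of subgroup orders has ten entries, matching the ten subgroups in (7.13); by Theorem 7.3.1, the index $8/|H|$ of each subgroup is the degree over $\mathbb{Q}$ of the corresponding field in (7.14).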


Finally, let us give an interesting application of the Galois correspondence. 


Proposition 7.3.5 Let $F \subset L$ be a finite separable extension. Then there are only 
finitely many intermediate fields $F \subset K \subset L$. 


Proof: By Proposition 7.1.7 there is an extension $L \subset M$ such that $F \subset M$ is Galois. 
Then Theorem 7.3.2 implies that subfields of $M$ containing $F$ correspond to subgroups 
of $\mathrm{Gal}(M/F)$. Since $\mathrm{Gal}(M/F)$ is finite, it has only finitely many subgroups, so that 
there are only finitely many subfields of $M$ containing $F$. Since $F \subset L \subset M$, it follows 
in particular that there are only finitely many intermediate fields between $F$ and $L$. ∎ 


In contrast, there are finite purely inseparable extensions that have infinitely many 
intermediate fields. Here is a classic example. 


Example 7.3.6 Let $k$ be a field of characteristic $p$, and consider the extension 
$$F = k(t,u) \subset L,$$
where $L$ is the splitting field of $(x^p - t)(x^p - u) \in F[x]$. This extension was studied 
in Example 5.4.4, where we showed that it has no primitive element. Furthermore, 
Exercise 4 of Section 5.4 showed that $F \subset L$ is purely inseparable and $L = F(\alpha, \beta)$, 
where $\alpha^p = t$ and $\beta^p = u$. 

In Exercise 5 of Section 5.4 you proved that the intermediate fields 

(7.15) $$F \subset F(\alpha + \lambda\beta) \subset L$$

are all distinct as $\lambda$ ranges over the distinct elements of $F$. Since $F$ is infinite, we see 
that there are infinitely many intermediate fields. 

In Exercise 4, you will show that $\mathrm{Gal}(L/F) = \{1_L\}$. This means that in particular, 
$\mathrm{Gal}(L/F)$ has only one subgroup, namely $\{1_L\}$, yet $F \subset L$ has the infinitely many 
intermediate fields given by (7.15). ◁▷ 
This example shows that the Galois correspondence can break down spectacularly 
for purely inseparable splitting fields. 


Exercises for Section 7.3 


Exercise 1. Complete the proof of Theorem 7.3.1 by showing that $[\mathrm{Gal}(L/F):H] = [L_H:F]$ 
for all subgroups $H \subset \mathrm{Gal}(L/F)$. 


Exercise 2. Consider $\mathbb{Q} \subset L = \mathbb{Q}(i, \sqrt[4]{2})$. 
(a) Show that there are $\sigma, \tau \in \mathrm{Gal}(L/\mathbb{Q})$ such that 
$$\sigma(i) = i,\ \sigma(\sqrt[4]{2}) = i\sqrt[4]{2} \quad\text{and}\quad \tau(i) = -i,\ \tau(\sqrt[4]{2}) = \sqrt[4]{2}.$$
(b) Prove that $o(\sigma) = 4$, $o(\tau) = 2$ and that $\tau$ is complex conjugation restricted to $L$. 
(c) Prove that $\sigma$ and $\tau$ generate $\mathrm{Gal}(L/\mathbb{Q})$. 
(d) Show that $\mathrm{Gal}(L/\mathbb{Q}) \simeq D_8$, where $D_8$ is the dihedral group of order 8. 


Exercise 3. Let $L = \mathbb{Q}(i, \sqrt[4]{2})$ and $\sigma, \tau \in \mathrm{Gal}(L/\mathbb{Q})$ be as in Exercise 2 and Example 7.3.4. 
(a) Show that all subgroups of $\mathrm{Gal}(L/\mathbb{Q})$ are given by (7.13). 
(b) Show that the corresponding fixed fields are given by (7.14). 
(c) Determine which subgroups in part (a) are normal in $\mathrm{Gal}(L/\mathbb{Q})$, and for those that are 
normal, construct a polynomial whose splitting field is the corresponding fixed field. 
(d) For the subfields in part (b) that are not Galois over $\mathbb{Q}$, find all of their conjugate fields. 
Also describe the conjugates of their corresponding groups. 


Exercise 4. Prove that the extension $F \subset L$ of Example 7.3.6 has $\mathrm{Gal}(L/F) = \{1_L\}$. 


Exercise 5. Consider the extension $F = \mathbb{C}(t^4) \subset L = \mathbb{C}(t)$, where $t$ is a variable. 
(a) Show that $L$ is the splitting field of $x^4 - t^4 \in F[x]$ over $F$. 
(b) Show that $x^4 - t^4$ is irreducible over $F$. 
(c) Show that $\mathrm{Gal}(L/F) \simeq \mathbb{Z}/4\mathbb{Z}$. 
(d) Similar to what you did in Exercise 3, determine all subgroups of $\mathrm{Gal}(L/F)$ and the 
corresponding intermediate fields between $F$ and $L$. 
We will say more about this type of extension in Section 7.5. 


Exercise 6. This exercise will work out the Galois correspondence for the splitting field $L$ of 
$x^4 - 4x^2 + 2$ over $\mathbb{Q}$. In Exercise 6 of Section 5.1 you showed that $L = \mathbb{Q}(\sqrt{2 + \sqrt{2}})$ and that 
$\mathrm{Gal}(L/\mathbb{Q}) \simeq \mathbb{Z}/4\mathbb{Z}$. Now, similar to Example 7.3.4, determine all subgroups of $\mathrm{Gal}(L/\mathbb{Q})$ and 
the corresponding intermediate fields of $\mathbb{Q} \subset L$. 


Exercise 7. Let $\zeta_7 = e^{2\pi i/7}$, and consider the extension $\mathbb{Q} \subset L = \mathbb{Q}(\zeta_7)$. 
(a) Show that $L$ is the splitting field of $f = x^6 + x^5 + x^4 + x^3 + x^2 + x + 1$ over $\mathbb{Q}$ and that $f$ 
is the minimal polynomial of $\zeta_7$. 
(b) Let $(\mathbb{Z}/7\mathbb{Z})^*$ be the group of nonzero congruence classes modulo 7 under multiplication. 
By Exercise 4 of Section 6.2 there is a group isomorphism $\mathrm{Gal}(L/\mathbb{Q}) \simeq (\mathbb{Z}/7\mathbb{Z})^*$. Let 
$H \subset (\mathbb{Z}/7\mathbb{Z})^*$ be the subgroup generated by the congruence class of $-1$. Prove that 
$\mathbb{Q}(\zeta_7 + \zeta_7^{-1})$ is the fixed field of the subgroup of $\mathrm{Gal}(L/\mathbb{Q})$ corresponding to $H$. 


Exercise 8. Let $\alpha = \zeta_7 + \zeta_7^{-1}$, where $\zeta_7 = e^{2\pi i/7}$. 
(a) Show that the minimal polynomial of $\alpha$ over $\mathbb{Q}$ is $x^3 + x^2 - 2x - 1$. 
(b) Use Exercise 7 to show that the splitting field of $x^3 + x^2 - 2x - 1$ over $\mathbb{Q}$ is a Galois 
extension of degree 3 with Galois group isomorphic to $\mathbb{Z}/3\mathbb{Z}$. 

Exercise 9. Let $F$ be a field of characteristic different from 2, and let $F \subset L$ be a finite 
extension. Prove that the following are equivalent: 
(a) $L$ is a Galois extension of $F$ with $\mathrm{Gal}(L/F) \simeq \mathbb{Z}/2\mathbb{Z} \times \mathbb{Z}/2\mathbb{Z}$. 
(b) $L$ is the splitting field of a polynomial of the form $(x^2 - a)(x^2 - b)$, where $a, b \in F$ but 
$\sqrt{a}, \sqrt{b}, \sqrt{ab}$ do not lie in $F$. 


Exercise 10. Suppose that $\alpha, \beta \in \mathbb{C}$ are algebraic of degree 2 over $\mathbb{Q}$ (i.e., they are both roots 
of irreducible quadratic polynomials in $\mathbb{Q}[x]$). Prove that the following are equivalent: 
(a) $\mathbb{Q}(\alpha) = \mathbb{Q}(\beta)$. 
(b) $\alpha = a + b\beta$ for some $a, b \in \mathbb{Q}$, $b \ne 0$. 
(c) $\alpha + \beta$ is the root of a quadratic polynomial in $\mathbb{Q}[x]$. 

Exercise 11. Let $F \subset L$ be a Galois extension, and let $F \subset K \subset L$ be an intermediate 
field. Then let $N$ be the normalizer (as defined in the Mathematical Notes to Section 7.2) of 
$\mathrm{Gal}(L/K) \subset \mathrm{Gal}(L/F)$. Prove that the fixed field $L_N$ is the smallest subfield of $K$ such that $K$ 
is Galois over the subfield. 

Exercise 12. Let $H$ be a subgroup of a group $G$, and let $N = \bigcap_{g \in G} gHg^{-1}$. 
(a) Show that $N$ is a normal subgroup of $G$. 
(b) Show that $N$ is the largest normal subgroup of $G$ contained in $H$. 


Exercise 13. Let $F \subset L$ be a Galois extension, and let $F \subset K \subset L$ be an intermediate field. If 
we apply the construction of Exercise 12 to $\mathrm{Gal}(L/K) \subset \mathrm{Gal}(L/F)$, then we obtain a normal 
subgroup $N \subset \mathrm{Gal}(L/F)$. Prove that the fixed field $L_N$ is the Galois closure of $K$. 


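Exercise 14. Prove the implication (b) $\Rightarrow$ (a) of Theorem 6.5.5. 


Exercise 15. Let $p$ be prime. Consider the extension $\mathbb{Q} \subset L = \mathbb{Q}(\zeta_p, \sqrt[p]{2})$ discussed in 
Section 6.4. There, we showed that $\mathrm{Gal}(L/\mathbb{Q}) \simeq \mathrm{AGL}(1, \mathbb{F}_p)$. The group $\mathrm{AGL}(1, \mathbb{F}_p)$ has two 
subgroups defined as follows: 

$$T = \{\gamma_{1,b} \mid b \in \mathbb{F}_p\} \quad\text{and}\quad D = \{\gamma_{a,0} \mid a \in \mathbb{F}_p^*\},$$

where $\gamma_{a,b}(u) = au + b$, $u \in \mathbb{F}_p$. Let $T'$ and $D'$ be the corresponding subgroups of $\mathrm{Gal}(L/\mathbb{Q})$. 
(a) Show that the fixed field of $T'$ is $\mathbb{Q}(\zeta_p)$. 
(b) What is the fixed field of $D'$? What are the conjugates of this fixed field? 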


7.4 FIRST APPLICATIONS 


This section is devoted to three applications of the Galois correspondence. 


A. The Discriminant. The discriminant $\Delta(f) \in F$ of a nonconstant monic polynomial 
$f \in F[x]$ was defined in Section 2.4. There, we showed that if $f$ has degree 
$n \ge 2$ and $f = (x - \alpha_1)\cdots(x - \alpha_n)$ in a splitting field $L$ of $f$, then 

$$\Delta(f) = \prod_{i<j} (\alpha_i - \alpha_j)^2 \in F.$$

In Section 5.3, we saw that $f$ is separable if and only if $\Delta(f) \ne 0$. We define 

$$\sqrt{\Delta(f)} = \prod_{i<j} (\alpha_i - \alpha_j) \in L.$$

Note that while $\Delta(f)$ is uniquely determined by $f$, the above square root depends on 
how the roots are labeled. 

If $f \in F[x]$ is separable, then by Section 6.3 the action of the Galois group on the 
roots $\alpha_1, \dots, \alpha_n$ of $f$ gives a one-to-one group homomorphism 

$$\mathrm{Gal}(L/F) \longrightarrow S_n.$$

In $S_n$ we also have the alternating group $A_n \subset S_n$. Our first result shows that $\sqrt{\Delta(f)}$ 
controls the relation between $A_n$ and $\mathrm{Gal}(L/F)$. 


Theorem 7.4.1 Let $f$ and $F \subset L$ be as above, and assume that the characteristic of 
$F$ is different from 2. 
(a) If $\sigma \in \mathrm{Gal}(L/F)$ corresponds to $\tau \in S_n$, then 
$$\sigma\bigl(\sqrt{\Delta(f)}\bigr) = \mathrm{sgn}(\tau)\,\sqrt{\Delta(f)}.$$
(b) The image of $\mathrm{Gal}(L/F)$ lies in $A_n$ if and only if $\sqrt{\Delta(f)} \in F$ (i.e., $\Delta(f)$ is the 
square of an element of $F$). 


Proof: The result is trivial if $n = 1$. Hence we may assume that $n \ge 2$. Recall from 
Proposition 2.4.1 that $\sqrt{\Delta} = \prod_{i<j} (x_i - x_j) \in F[x_1, \dots, x_n]$ has the property that 
(7.16) $$\tau \cdot \sqrt{\Delta} = \mathrm{sgn}(\tau)\,\sqrt{\Delta}$$
for all $\tau \in S_n$. This gives the identity 

$$\prod_{i<j} (x_{\tau(i)} - x_{\tau(j)}) = \mathrm{sgn}(\tau) \prod_{i<j} (x_i - x_j)$$

in $F[x_1, \dots, x_n]$. Since the evaluation map $F[x_1, \dots, x_n] \to L$ sending $x_i$ to $\alpha_i$ is a ring 
homomorphism (Theorem 2.1.2), it follows that 

$$\prod_{i<j} (\alpha_{\tau(i)} - \alpha_{\tau(j)}) = \mathrm{sgn}(\tau) \prod_{i<j} (\alpha_i - \alpha_j) = \mathrm{sgn}(\tau)\,\sqrt{\Delta(f)}.$$

However, we also have $\sigma(\alpha_i) = \alpha_{\tau(i)}$, which implies that 

$$\prod_{i<j} (\alpha_{\tau(i)} - \alpha_{\tau(j)}) = \sigma\bigl(\sqrt{\Delta(f)}\bigr).$$

This completes the proof of part (a). For part (b), observe that $F \subset L$ is Galois, so 
that $F$ is the fixed field of $\mathrm{Gal}(L/F)$. Combining this with (a), we obtain 

$$\sqrt{\Delta(f)} \in F \iff \sigma\bigl(\sqrt{\Delta(f)}\bigr) = \sqrt{\Delta(f)} \text{ for all } \sigma \in \mathrm{Gal}(L/F)$$
$$\iff \mathrm{sgn}(\tau)\,\sqrt{\Delta(f)} = \sqrt{\Delta(f)} \text{ for all } \sigma \in \mathrm{Gal}(L/F),$$

where $\tau$ is the image of $\sigma$ under the map $\mathrm{Gal}(L/F) \to S_n$. Since $\Delta(f) \ne 0$ and $F$ 
has characteristic $\ne 2$, the last condition is equivalent to $\mathrm{sgn}(\tau) = 1$ for all $\tau$ coming 
from $\mathrm{Gal}(L/F)$. Then we are done, since $\mathrm{sgn}(\tau) = 1$ if and only if $\tau \in A_n$. ∎ 

This result allows us to compute the Galois group of an irreducible cubic. 


Proposition 7.4.2 Let $f \in F[x]$ be a monic irreducible separable cubic, where $F$ has 
characteristic $\ne 2$. If $L$ is the splitting field of $f$ over $F$, then 

$$\mathrm{Gal}(L/F) \simeq \begin{cases} \mathbb{Z}/3\mathbb{Z}, & \text{if } \Delta(f) \text{ is a square in } F, \\ S_3, & \text{otherwise.} \end{cases}$$


Proof: Exercise 6 of Section 6.2 implies that $|\mathrm{Gal}(L/F)|$ is a multiple of 3, since 
$f$ is irreducible and separable. We also have the one-to-one map $\mathrm{Gal}(L/F) \to S_3$. 
Since the only subgroups of $S_3$ of order divisible by 3 are $S_3$ and $A_3$, the proposition 
follows easily from Theorem 7.4.1. We leave the details as Exercise 1. ∎ 


Here is a simple example of this proposition. 


Example 7.4.3 Consider $f = x^3 + x^2 - 2x - 1 \in \mathbb{Q}[x]$. It is easy to see that $f$ is 
irreducible over $\mathbb{Q}$ and hence is separable, since we are in characteristic 0. Using the 
method discussed in Section 5.3 one computes that 

$$\Delta(f) = 49 = 7^2.$$
By Proposition 7.4.2 the Galois group of $f$ over $\mathbb{Q}$ is cyclic of order 3. ◁▷ 
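
The computation in Example 7.4.3 is easy to reproduce. The sketch below is an illustration under stated assumptions rather than the method of Section 5.3: it uses the standard closed formula for the discriminant of a monic cubic $x^3 + bx^2 + cx + d$, restricts to integer coefficients over $\mathbb{Q}$, and tests irreducibility with the rational root test (a monic integer cubic is reducible over $\mathbb{Q}$ exactly when it has an integer root dividing $d$). The function names are hypothetical.

```python
from math import isqrt


def cubic_discriminant(b, c, d):
    """Discriminant of the monic cubic x^3 + b*x^2 + c*x + d."""
    return b * b * c * c - 4 * c**3 - 4 * b**3 * d - 27 * d * d + 18 * b * c * d


def has_integer_root(b, c, d):
    """Rational root test for a monic cubic with integer coefficients."""
    if d == 0:
        return True  # x = 0 is a root
    divisors = [m for m in range(1, abs(d) + 1) if d % m == 0]
    return any(r**3 + b * r * r + c * r + d == 0
               for m in divisors for r in (m, -m))


def galois_group_of_cubic(b, c, d):
    """Apply Proposition 7.4.2 over Q to x^3 + b*x^2 + c*x + d (integer b, c, d)."""
    if has_integer_root(b, c, d):
        raise ValueError("cubic is reducible over Q; Proposition 7.4.2 does not apply")
    disc = cubic_discriminant(b, c, d)
    # An integer is a square in Q exactly when it is a perfect square.
    is_square = disc > 0 and isqrt(disc) ** 2 == disc
    return "Z/3Z" if is_square else "S3"


print(cubic_discriminant(1, -2, -1), galois_group_of_cubic(1, -2, -1))
print(cubic_discriminant(0, 0, -2), galois_group_of_cubic(0, 0, -2))
```

For $x^3 + x^2 - 2x - 1$ the discriminant is $49$, giving $\mathbb{Z}/3\mathbb{Z}$; for $x^3 - 2$ it is $-108$, which is not a square, in agreement with $\mathrm{Gal}(L/\mathbb{Q}) \simeq S_3$ in Example 7.3.3.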


In Exercise 2 you will compute the Galois groups of some other cubics, and in 
Chapter 13 we will compute the Galois groups of quartics and quintics. 


B. The Universal Extension. Consider the universal extension in degree $n$, 
(7.17) $$K = F(\sigma_1, \dots, \sigma_n) \subset L = F(x_1, \dots, x_n),$$
where as usual $\sigma_1, \dots, \sigma_n$ are the elementary symmetric polynomials. Recall from 
Section 6.4 that this is the splitting field of the universal polynomial of degree $n$, 

$$\tilde{f} = x^n - \sigma_1 x^{n-1} + \cdots + (-1)^n \sigma_n = \prod_{i=1}^{n} (x - x_i).$$

Theorem 6.4.1 implies that $K \subset L$ is Galois with Galois group $\mathrm{Gal}(L/K) \simeq S_n$. 
Furthermore, if we identify $\mathrm{Gal}(L/K)$ with $S_n$, then $\sigma \in S_n$ becomes the automorphism 
of $L = F(x_1, \dots, x_n)$ that permutes the $x_i$ according to $\sigma$. 

Then the Fundamental Theorem of Galois Theory implies the following facts 
about symmetric functions. As above, set 

$$\sqrt{\Delta} = \prod_{i<j} (x_i - x_j).$$


Theorem 7.4.4 Let $R \in F(x_1, \dots, x_n)$ be a rational function. 
(a) $R$ is invariant under $S_n$ if and only if $R \in F(\sigma_1, \dots, \sigma_n)$. 
(b) Assume that $F$ has characteristic $\ne 2$. Then $R$ is invariant under $A_n$ if and only 
if there are $A, B \in F(\sigma_1, \dots, \sigma_n)$ such that 

$$R = A + B\sqrt{\Delta}.$$


Proof: Since (7.17) is a Galois extension, Theorem 7.3.1 implies that $K$ is the fixed 
field of $\mathrm{Gal}(L/K) = S_n$ acting on $L$. This proves part (a). 

To prove part (b), let $M = L_{A_n}$ be the fixed field of $A_n$ acting on $L$. Since $A_n$ has 
index 2 in $\mathrm{Gal}(L/K) = S_n$, Theorem 7.3.1 implies that $K \subset M$ is an extension of 
degree 2. However, (7.16) shows that $\sqrt{\Delta} \in M$, so that we have 

$$K \subset K(\sqrt{\Delta}) \subset M.$$
By the Tower Theorem, it follows that 
$$2 = [M:K] = [M:K(\sqrt{\Delta})]\,[K(\sqrt{\Delta}):K].$$

But (7.16) also shows that $\sqrt{\Delta} \notin K$ (since $F$ and hence $K$ have characteristic $\ne 2$). 
We conclude that $K(\sqrt{\Delta}) = M$. Finally, since $\sqrt{\Delta}$ is a primitive element of the 
degree 2 extension $K \subset M$, Proposition 4.3.4 implies that 

$$M = \{A + B\sqrt{\Delta} \mid A, B \in K\}.$$
This completes the proof of part (b). ∎ 
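
Identity (7.16), which drives both Theorem 7.4.1 and the proof above, can be checked numerically. The following minimal sketch assumes we evaluate $\sqrt{\Delta} = \prod_{i<j}(x_i - x_j)$ at distinct integer sample points and let a permutation $\tau$ act by sending $x_i$ to $x_{\tau(i)}$; the helper names `sqrt_disc`, `sign`, and `check_sign_rule` are illustrative, and the sign is computed by counting inversions.

```python
from itertools import permutations
from math import prod


def sqrt_disc(values):
    """Evaluate prod_{i<j} (x_i - x_j) at the given sample values."""
    n = len(values)
    return prod(values[i] - values[j] for i in range(n) for j in range(i + 1, n))


def sign(perm):
    """Sign of a permutation of 0..n-1, via the parity of its inversion count."""
    n = len(perm)
    inversions = sum(1 for i in range(n) for j in range(i + 1, n) if perm[i] > perm[j])
    return -1 if inversions % 2 else 1


def check_sign_rule(values):
    """Check tau . sqrt(Delta) == sgn(tau) * sqrt(Delta) for every tau in S_n."""
    base = sqrt_disc(values)
    n = len(values)
    return all(
        sqrt_disc([values[p[i]] for i in range(n)]) == sign(p) * base
        for p in permutations(range(n))
    )


print(check_sign_rule([2, 3, 5, 7]))
```

A numerical check at one point is of course not a proof, but it makes the sign change under each transposition concrete: even permutations leave the product unchanged, which is why $\sqrt{\Delta}$ lies in the fixed field of $A_n$ but not of $S_n$.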


In Chapter 2 we proved part (a) of Theorem 7.4.4 by first considering the case 
when $R$ is a polynomial in $x_1, \dots, x_n$ (Theorem 2.2.2) and then doing the case when 
$R$ is a rational function (Exercise 8 of Section 2.2). The proof given above is much 
shorter. This illustrates nicely the power of Galois theory. 

On the other hand, if $R$ is a polynomial in $x_1, \dots, x_n$, then part (a) of Theorem 7.4.4 
does not assert that $R$ is a polynomial in the $\sigma_i$; the theorem only tells us that $R$ is in 
the field $F(\sigma_1, \dots, \sigma_n)$. The point is that Galois theory deals with fields rather than 
rings. In Exercise 3 you will study what happens in part (b) of Theorem 7.4.4 when 
$R$ is a polynomial in $x_1, \dots, x_n$. 


C. The Inverse Galois Problem. The Galois group $\mathrm{Gal}(L/F)$ of a finite extension 
$F \subset L$ is a finite group. But what finite groups can arise in this way? We will discuss 
two aspects of this question. 

We first note that a finite group $G$ of order $n$ is isomorphic to a subgroup of $S_n$. 
The easiest way to see this is to label the elements of $G$ as $g_1, \dots, g_n$. Then the group 
operation on $G$ can be represented by its Cayley table, where the entry in row $i$ and 
column $j$ is $g_i g_j$. For example, the Cayley table of $S_3$ is 

(7.18) 
          |  e      (123)  (132)  (12)   (13)   (23)
   -------+------------------------------------------
    e     |  e      (123)  (132)  (12)   (13)   (23)
    (123) |  (123)  (132)  e      (13)   (23)   (12)
    (132) |  (132)  e      (123)  (23)   (12)   (13)
    (12)  |  (12)   (23)   (13)   e      (132)  (123)
    (13)  |  (13)   (12)   (23)   (123)  e      (132)
    (23)  |  (23)   (13)   (12)   (132)  (123)  e

In 1854 Cayley observed that every row of the Cayley table of $G$ is a permutation 
of the elements of $G$ (do Exercise 4 if you didn't prove this in your abstract algebra 
course). It follows that for the $i$th row, there is a permutation $\sigma_i \in S_n$ (where $n = |G|$) 
such that 

(7.19) $$g_i g_j = g_{\sigma_i(j)}.$$

In Exercise 5 you will compute the six elements of $S_6$ given by the rows of (7.18), 
and in Exercise 6 you will show that in general, the map $G \to S_n$ given by $g_i \mapsto \sigma_i$ 
is a one-to-one group homomorphism. It follows that $G$ is isomorphic to a subgroup 
of $S_n$. Combining this with the Galois correspondence for the universal extension in 
degree $n$ gives the following nice result. 


Theorem 7.4.5 Given a finite group $G$, there is a Galois extension whose Galois 
group is isomorphic to $G$. 


Proof: Let $G$ be a finite group of order $n$, and let $F$ be an arbitrary field. We know 
that the universal extension in degree $n$, 

$$K = F(\sigma_1, \dots, \sigma_n) \subset L = F(x_1, \dots, x_n),$$

is a Galois extension with Galois group $\mathrm{Gal}(L/K) \simeq S_n$. 

Since $G$ is isomorphic to a subgroup of $S_n$, it follows that $G$ is also isomorphic 
to a subgroup $H \subset \mathrm{Gal}(L/K)$. Then the fixed field of $H$ is an intermediate field 
$K \subset L_H \subset L$, and the Fundamental Theorem of Galois Theory tells us that $L_H \subset L$ is 
a Galois extension with Galois group 

$$\mathrm{Gal}(L/L_H) = H \simeq G.$$
This shows that $L_H \subset L$ is the desired extension. ∎ 
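
The embedding $G \to S_n$ of (7.19) that this proof relies on can be built mechanically from any multiplication rule. Here is a minimal sketch, assuming $G = S_3$ is modeled as permutations of $\{0,1,2\}$ composed right to left; the ordering of `elements` is whatever `itertools.permutations` produces (not the labeling of Exercise 5), and `regular_representation` is an illustrative name.

```python
from itertools import permutations


def compose(p, q):
    """Return the permutation p∘q (apply q first, then p)."""
    return tuple(p[q[k]] for k in range(len(q)))


def regular_representation(elements, op):
    """For each g_i, return sigma_i as a tuple with g_i * g_j == g_{sigma_i(j)}."""
    index = {g: j for j, g in enumerate(elements)}
    return [tuple(index[op(g, h)] for h in elements) for g in elements]


elements = list(permutations(range(3)))       # the six elements of S_3
sigmas = regular_representation(elements, compose)

n = len(elements)
# Each row of the Cayley table is a permutation of the indices 0..n-1.
rows_are_permutations = all(sorted(s) == list(range(n)) for s in sigmas)
# Distinct group elements give distinct permutations (the map is one-to-one).
injective = len(set(sigmas)) == n
# g_i g_j = g_k implies sigma_i ∘ sigma_j = sigma_k (the map is a homomorphism).
index = {g: j for j, g in enumerate(elements)}
homomorphism = all(
    compose(sigmas[i], sigmas[j]) == sigmas[index[compose(elements[i], elements[j])]]
    for i in range(n) for j in range(n)
)
print(rows_are_permutations, injective, homomorphism)
```

The same function works for any finite group given as a list of hashable elements and a binary operation, which is all that the argument before Theorem 7.4.5 uses.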


However, this is not the end of the story, for in the extension $L_H \subset L$ constructed 
in Theorem 7.4.5, the smaller field $L_H$ depends on the group. In explicit examples, 
one is often interested in Galois groups of polynomials over $\mathbb{Q}$. Thus the question is: 
Which finite groups can occur as the Galois group of a finite extension of $\mathbb{Q}$? This is 
called the inverse Galois problem for $\mathbb{Q}$. 

There has been a lot of work on this problem, starting with Hilbert, who used his 
irreducibility theorem (mentioned in the Mathematical Notes to Section 6.4) to show 
that for every $n > 1$, both $S_n$ and $A_n$ can occur as Galois groups of Galois extensions 
of $\mathbb{Q}$. In Section 6.4 we also gave the example of $x^n - x - 1$, whose Galois group 
over $\mathbb{Q}$ is $S_n$ for $n > 2$. Another example is the polynomial 

$$p_n(x) = 1 + x + \frac{1}{2!}x^2 + \cdots + \frac{1}{n!}x^n$$

obtained by truncating the power series for $e^x$. In 1930, Schur proved that the Galois 
group of $p_n$ over $\mathbb{Q}$ is $A_n$ when $n \equiv 0 \bmod 4$ and $S_n$ otherwise (see [Chebotarev, 
p. 398] for references and further examples). In the case of a prime $p$, the paper [7] 
uses elementary methods to construct a polynomial of degree $p$ whose Galois group 
over $\mathbb{Q}$ is $S_p$. The book Inverse Galois Theory [5] discusses some of the powerful 
methods used to study this unsolved problem in general. See also [1]. 


Historical Notes 


The universal extension in degree $n$, 
$$K = F(\sigma_1, \dots, \sigma_n) \subset L = F(x_1, \dots, x_n),$$
plays a central (though somewhat implicit) role in Lagrange's 1770 treatise on solving 
equations by radicals (see the Historical Notes to Section 1.2). In his paper, Lagrange 
proves many interesting results, including a theorem that (in modern terminology) 
says that if $\varphi \in L$, then the intermediate field $K \subset K(\varphi) \subset L$ satisfies 

$$K(\varphi) = L_{\mathrm{Gal}(L/K(\varphi))}.$$

For us, this is part of the Galois correspondence, yet Lagrange proved this result 60 
years before Galois. We will discuss Lagrange's work in more detail in Chapter 12. 


Exercises for Section 7.4 


Exercise 1. Give a detailed proof of Proposition 7.4.2. 


Exercise 2. Compute the Galois groups of the following cubic polynomials: 
(a) $x^3 - 4x + 2$ over $\mathbb{Q}$. 
(b) $x^3 - 4x + 2$ over $\mathbb{Q}(\sqrt{37})$. 
(c) $x^3 - 3x + 1$ over $\mathbb{Q}$. 
(d) $x^3 - t$ over $\mathbb{C}(t)$, $t$ a variable. 
(e) $x^3 - t$ over $\mathbb{Q}(t)$, $t$ a variable. 


Exercise 3. This exercise will study part (b) of Theorem 7.4.4 when $f$ is a polynomial 
in $x_1, \dots, x_n$ that is invariant under $A_n$. The theorem implies that $f = A + B\sqrt{\Delta}$ for some 
$A, B \in F(\sigma_1, \dots, \sigma_n)$. You will prove that $A$ and $B$ are polynomials in the $\sigma_i$. Recall that $F$ is a 
field of characteristic $\ne 2$. 
(a) Show that $f + (12)\cdot f = 2A$. 
(b) In part (a), the left-hand side is a polynomial while the right-hand side is a symmetric 
rational function. Use Theorem 2.2.2 to conclude that $A$ is a polynomial in the $\sigma_i$. 
(c) Let $P$ denote the product of $f - A$ and $(12)\cdot(f - A)$. Show that $P = -B^2\Delta$. 
(d) Let $B = u/v$, where $u, v \in F[\sigma_1, \dots, \sigma_n]$ are relatively prime (recall that $F[\sigma_1, \dots, \sigma_n]$ is 
a UFD). In Exercise 8 of Section 2.4 you showed that $\Delta$ is irreducible in $F[\sigma_1, \dots, \sigma_n]$. 
Use this and the equation $v^2 P = -u^2 \Delta$ to show that $v$ must be constant. This will prove 
that $B \in F[\sigma_1, \dots, \sigma_n]$. 


Exercise 4. Let $G$ be a group of order $n$, and fix $g \in G$. 
(a) Show that the map $G \to G$ defined by $h \mapsto gh$ is one-to-one and onto. 
(b) Explain why part (a) implies that each row of the Cayley table of $G$ is a permutation of 
the elements of $G$. 
(c) Write $G = \{g_1, \dots, g_n\}$, and fix $g_i \in G$. Use part (a) to show the existence of $\sigma_i \in S_n$ 
satisfying $g_i g_j = g_{\sigma_i(j)}$ as in (7.19). 

Exercise 5. Label the elements of $S_3$ as $g_1 = e$, $g_2 = (123)$, $g_3 = (132)$, $g_4 = (12)$, $g_5 = (13)$, 
and $g_6 = (23)$. Write down the six permutations $\sigma_i \in S_6$ defined by the rows of the Cayley 
table (7.18). 

Exercise 6. In the situation of Exercise 4, let $G = \{g_1, \dots, g_n\}$, and assume that $g_i g_j = g_k$. Let 
$\sigma_i, \sigma_j, \sigma_k \in S_n$ be the corresponding permutations determined by (7.19). 
(a) Prove that $\sigma_i \sigma_j = \sigma_k$. 
(b) Prove that the map $G \to S_n$ defined by $g_i \mapsto \sigma_i$ is a one-to-one group homomorphism. 

Exercise 7. Let $f$ and $F \subset L$ satisfy the hypothesis of Proposition 7.4.2, and assume 
that $\sqrt{\Delta(f)} \notin F$. Prove that $\mathrm{Gal}(L/F(\sqrt{\Delta(f)})) \simeq \mathbb{Z}/3\mathbb{Z}$ and that $f$ is irreducible over 
$F(\sqrt{\Delta(f)})$. 


7.5 AUTOMORPHISMS AND GEOMETRY (OPTIONAL) 


This optional section will explore some unexpected connections between geometry 
and Galois theory. 


A. Groups of Automorphisms. The theory developed in Chapters 6 and 7 begins 
with an extension $F \subset L$ and then considers its Galois group $\mathrm{Gal}(L/F)$. We will now 
change our point of view and instead begin with a field $L$ and a finite group $G$ of 
automorphisms of $L$. 

Here are two simple examples. 


Example 7.5.1 Let $L = \mathbb{Q}(\zeta_7)$, and consider the automorphism $\sigma$ of $L$ that maps $\zeta_7$ 
to $\zeta_7^2$. Since $2^3 \equiv 1 \bmod 7$, we see that $G = \langle\sigma\rangle = \{1_L, \sigma, \sigma^2\} \simeq \mathbb{Z}/3\mathbb{Z}$ is a group of 
automorphisms of $L$. ◁▷ 


Example 7.5.2 Let $L = \mathbb{C}(t)$, where $t$ is a variable. It is easy to see that $t \mapsto 1/t$ 
induces an automorphism $\sigma$ of $L$. This gives the group $G = \langle\sigma\rangle = \{1_L, \sigma\} \simeq \mathbb{Z}/2\mathbb{Z}$ 
of automorphisms of $L$. ◁▷ 

Given a finite group $G$ of automorphisms of a field $L$, we get the fixed field 
$$L_G \subset L.$$

Furthermore, the definition of $L_G$ easily implies that $G \subset \mathrm{Gal}(L/L_G)$. However, much 
more is true, as we will now prove. 


Theorem 7.5.3 Let $G$ be a finite group of automorphisms of a field $L$. Then: 
(a) $[L:L_G] = |G|$. 
(b) $L_G \subset L$ is a Galois extension. 
(c) $\mathrm{Gal}(L/L_G) = G$. 
Proof: Let $n = |G|$. We first claim that 
(7.20) $$[L:L_G] \le n.$$
If (7.20) is false, then we can find $\alpha_1, \dots, \alpha_{n+1} \in L$ that are linearly independent 
over $L_G$. Also let the elements of $G$ be $\sigma_1 = 1_L, \sigma_2, \dots, \sigma_n$. Then, given unknowns 
$x_1, \dots, x_{n+1}$, consider the equations 

(7.21) 
$$\begin{aligned}
x_1\sigma_1(\alpha_1) + \cdots + x_{n+1}\sigma_1(\alpha_{n+1}) &= 0 \\
x_1\sigma_2(\alpha_1) + \cdots + x_{n+1}\sigma_2(\alpha_{n+1}) &= 0 \\
&\;\;\vdots \\
x_1\sigma_n(\alpha_1) + \cdots + x_{n+1}\sigma_n(\alpha_{n+1}) &= 0.
\end{aligned}$$

This is a system of $n$ homogeneous equations in $n+1$ unknowns with coefficients in 
the field $L$. 
Since the number of unknowns exceeds the number of equations, (7.21) must have 
a nontrivial solution $(x_1, \dots, x_{n+1}) = (\beta_1, \dots, \beta_{n+1})$ in $L^{n+1}$. Among all nontrivial 
solutions in $L^{n+1}$, pick one that has the fewest nonzero $\beta_i$'s. Relabeling, we can 
write this solution as $(\beta_1, \dots, \beta_r, 0, \dots, 0)$, where $\beta_1, \dots, \beta_r$ are nonzero. Then being 
a solution of (7.21) means that 
$$\beta_1\sigma_i(\alpha_1) + \cdots + \beta_r\sigma_i(\alpha_r) = 0, \quad i = 1, \dots, n.$$

Observe that $r > 1$ since $\beta_1 \ne 0$ and $\alpha_1 \ne 0$. Furthermore, we may divide by $\beta_1$ and 
relabel $\beta_2, \dots, \beta_r$ to obtain 

(7.22) $$\sigma_i(\alpha_1) + \beta_2\sigma_i(\alpha_2) + \cdots + \beta_r\sigma_i(\alpha_r) = 0, \quad i = 1, \dots, n.$$
Since $\sigma_1 = 1_L$, setting $i = 1$ in (7.22) gives 
$$\alpha_1 + \beta_2\alpha_2 + \cdots + \beta_r\alpha_r = 0.$$
Hence $\beta_2, \dots, \beta_r$ cannot all lie in $L_G$ since $\alpha_1, \dots, \alpha_r$ are linearly independent over 
$L_G$ by assumption. Relabeling, we may assume that $\beta_r \notin L_G$, so that $\sigma(\beta_r) \ne \beta_r$ for 
some $\sigma \in G$. Now apply this $\sigma$ to (7.22) to obtain 
$$\sigma\sigma_i(\alpha_1) + \sigma(\beta_2)\,\sigma\sigma_i(\alpha_2) + \cdots + \sigma(\beta_r)\,\sigma\sigma_i(\alpha_r) = 0, \quad i = 1, \dots, n.$$

Since $G = \{\sigma_1, \dots, \sigma_n\}$ is a group under composition, the product $\sigma\sigma_i$ gives all 
elements of $G$ as we vary $\sigma_i$. Thus we obtain 

(7.23) $$\sigma_i(\alpha_1) + \sigma(\beta_2)\sigma_i(\alpha_2) + \cdots + \sigma(\beta_r)\sigma_i(\alpha_r) = 0, \quad i = 1, \dots, n.$$

Now multiply the equations of (7.22) by $\sigma(\beta_r)$ and the equations of (7.23) by $\beta_r$ 
and subtract. This choice of multipliers causes the coefficients of $\sigma_i(\alpha_r)$ to cancel. 
Hence we are left with the equations 

$$(\sigma(\beta_r) - \beta_r)\,\sigma_i(\alpha_1) + (\sigma(\beta_r)\beta_2 - \beta_r\sigma(\beta_2))\,\sigma_i(\alpha_2) + \cdots + (\sigma(\beta_r)\beta_{r-1} - \beta_r\sigma(\beta_{r-1}))\,\sigma_i(\alpha_{r-1}) = 0, \quad i = 1, \dots, n.$$
Thus the $(n+1)$-tuple 

$$(\sigma(\beta_r) - \beta_r,\ \sigma(\beta_r)\beta_2 - \beta_r\sigma(\beta_2),\ \dots,\ \sigma(\beta_r)\beta_{r-1} - \beta_r\sigma(\beta_{r-1}),\ 0, \dots, 0)$$

is a solution of (7.21). It has at most $r - 1$ nonzero entries and is nontrivial, since 
$\sigma(\beta_r) \ne \beta_r$. This contradicts our choice of $r$ and completes the proof of (7.20). 
It follows that $L_G \subset L$ is a finite extension. Furthermore, we have 

$$L_G \subset L_{\mathrm{Gal}(L/L_G)} \subset L_G,$$

where the first inclusion follows because elements of $\mathrm{Gal}(L/L_G)$ are the identity 
on $L_G$, and the second follows from $G \subset \mathrm{Gal}(L/L_G)$. We conclude that $L_G$ is the 
fixed field of $\mathrm{Gal}(L/L_G)$, and then Theorem 7.1.1 implies that $L_G \subset L$ is normal and 
separable and hence is a Galois extension. This proves part (b) of the theorem. 

Since $L_G \subset L$ is Galois, Theorem 7.1.5 implies that $[L:L_G] = |\mathrm{Gal}(L/L_G)|$. 
Combining this with (7.20), we have 

$$|\mathrm{Gal}(L/L_G)| = [L:L_G] \le n = |G| \le |\mathrm{Gal}(L/L_G)|,$$

since $G \subset \mathrm{Gal}(L/L_G)$. From here, parts (a) and (c) follow easily. ∎ 

Here is an example of how to use this theorem. 

Example 7.5.4 In Example 7.5.2 we considered the group of automorphisms of 
$L = \mathbb{C}(t)$ given by $G = \langle\sigma\rangle$, where $\sigma(t) = 1/t$. Since $t + t^{-1}$ is obviously fixed by $G$, 
we have 

(7.24) $$\mathbb{C}(t + t^{-1}) \subset L_G \subset L = \mathbb{C}(t).$$

However, $t$ is a root of $(x - t)(x - t^{-1}) = x^2 - (t + t^{-1})x + 1 \in \mathbb{C}(t + t^{-1})[x]$. 
Furthermore, the inclusions 

$$\mathbb{C}(t) \subset \mathbb{C}(t + t^{-1})(t) \subset \mathbb{C}(t)$$

show that $\mathbb{C}(t + t^{-1})(t) = \mathbb{C}(t)$. Thus $\mathbb{C}(t)$ is obtained by adjoining $t$ to $\mathbb{C}(t + t^{-1})$. 
Since $t$ is a root of a quadratic equation with coefficients in $\mathbb{C}(t + t^{-1})$, we have 

$$[\mathbb{C}(t):\mathbb{C}(t + t^{-1})] \le 2.$$

Theorem 7.5.3 implies that $[L:L_G] = |G| = 2$. Using (7.24) and the Tower Theorem, 
it follows easily that $L_G = \mathbb{C}(t + t^{-1})$. ◁▷ 


B. Function Fields in One Variable. Example 7.5.4 is a Galois extension 
constructed from the field $\mathbb{C}(t)$ of rational functions in the variable $t$ with coefficients 
in $\mathbb{C}$. More generally, the function field $F(t)$, where $F$ is an arbitrary field, has some 
interesting subfields as follows. 


Proposition 7.5.5 Assume that $\alpha \in F(t)$ is a rational function not in $F$, and write 
$\alpha = a(t)/b(t)$, where $a(t), b(t) \in F[t]$ are relatively prime. Then: 
(a) $\alpha$ is transcendental over $F$. 
(b) The polynomial $a(x) - \alpha b(x) \in F(\alpha)[x]$ is irreducible over $F(\alpha)$. 
(c) $F(\alpha) \subset F(t)$ is a finite extension of degree 

$$[F(t):F(\alpha)] = \max(\deg(a), \deg(b)).$$


Proof: If $\alpha$ is algebraic over $F$, then $\alpha$ satisfies an equation 
$$\alpha^n + a_1\alpha^{n-1} + \cdots + a_n = 0,$$
where $n \ge 1$ and $a_1, \dots, a_n \in F$. Substituting $\alpha = a(t)/b(t)$ into the above equation 
and multiplying by $b(t)^n$ gives 

(7.25) $$a(t)^n + a_1 a(t)^{n-1} b(t) + \cdots + a_n b(t)^n = 0$$
in the polynomial ring $F[t]$. This implies that 
$$a(t)^n = b(t)\cdot\bigl(-a_1 a(t)^{n-1} - \cdots - a_n b(t)^{n-1}\bigr).$$

Since $a(t)$ and $b(t)$ are relatively prime, $b(t)$ must be constant, say $b_0 \in F$. Then 
substituting $b(t) = b_0$ into (7.25) gives 

$$a(t)^n + a_1 a(t)^{n-1} b_0 + \cdots + a_n b_0^n = 0$$

in $F[t]$. This implies that $a(t)$ is also constant (can you explain why?), say $a_0 \in F$. 
Then $\alpha = a(t)/b(t) = a_0/b_0 \in F$, which is a contradiction. Part (a) follows. 
For parts (b) and (c), first observe that since $\alpha$ is a rational function of $t$, we have 

$$F(\alpha) \subset F(\alpha)(t) = F(\alpha, t) = F(t).$$

In other words, $F(t)$ is obtained by adjoining $t$ to $F(\alpha)$. Thus $[F(t):F(\alpha)]$ is the 
degree of the minimal polynomial of $t$ over $F(\alpha)$. To find the minimal polynomial, we 
will use the relatively prime polynomials $a(t), b(t) \in F[t]$ appearing in $\alpha = a(t)/b(t)$. 
Consider the polynomial in $x$ defined by $a(x) - \alpha b(x)$. Then: 

• $a(x) - \alpha b(x)$ is a polynomial in $x$ with coefficients in $F(\alpha)$. 

• $t$ is a root of $a(x) - \alpha b(x)$ since $a(t) - \alpha b(t) = a(t) - a(t) = 0$. 

• If $a(x) = a_0 x^n + \cdots$ and $b(x) = b_0 x^m + \cdots$ have degrees $n$ and $m$, then 
$$a(x) - \alpha b(x) = (a_0 x^n + \cdots) - \alpha(b_0 x^m + \cdots).$$
None of the coefficients can cancel because $\alpha \notin F$. Hence the degree of $x$ in 
$a(x) - \alpha b(x)$ is $\max(n, m) = \max(\deg(a), \deg(b))$. 

Now suppose that $a(x) - \alpha b(x)$ is irreducible over $F(\alpha)$. Then the above bullets 
imply that 

$$[F(t):F(\alpha)] = \text{the degree of } x \text{ in } a(x) - \alpha b(x) = \max(\deg(a), \deg(b)).$$

Thus part (c) of the proposition follows from part (b). 

To prove part (b), we begin in the polynomial ring $F[x,y]$ for variables $x, y$. By 
Theorem 2.1.1, $F[x,y]$ is a UFD. We first claim that $a(x) - y\,b(x)$ is irreducible in 
$F[x,y]$. This is easy to see, for if 

$$a(x) - y\,b(x) = AB, \quad A, B \in F[x,y],$$

then $A$ and $B$ can't both have positive degree in $y$. We may assume that $A \in F[x]$. 
In Exercise 1 you will show that this implies that $A$ divides $a(x)$ and $b(x)$. Hence 
$A$ is constant, since $a(x), b(x)$ are relatively prime. This proves that $a(x) - y\,b(x)$ is 
irreducible in $F[x,y]$. 

Now consider $a(x) - y\,b(x)$ as a polynomial in $F(y)[x]$, i.e., as a polynomial in 
$x$ with coefficients in $F(y)$. We claim that it is irreducible over $F(y)$ because it is 
irreducible in $F[x,y]$. This can be proved several ways: Exercise 2 uses Gauss's 
Lemma, and Exercise 3 gives a more elementary proof. 

To apply this to our situation, recall that $\alpha$ is transcendental over $F$ by part (a). 
This means that $\alpha$ can be regarded as a variable over $F$. Hence $y \mapsto \alpha$ induces a 
ring isomorphism $F(y)[x] \simeq F(\alpha)[x]$ that takes $a(x) - y\,b(x)$ to $a(x) - \alpha b(x)$. Then 
the previous paragraph implies that $a(x) - \alpha b(x) \in F(\alpha)[x]$ is irreducible over $F(\alpha)$. 
This completes the proof of the proposition. ∎ 


Here is an example that illustrates Theorem 7.5.3 and Proposition 7.5.5. 

Example 7.5.6 Consider the automorphisms $\sigma$ and $\tau$ of $\mathbb{C}(t)$ defined by 
$$\sigma(\alpha(t)) = \alpha(\zeta_n t) \quad\text{and}\quad \tau(\alpha(t)) = \alpha(t^{-1}),$$
where $\zeta_n = e^{2\pi i/n}$ and $\alpha(t)$ is an arbitrary rational function in $\mathbb{C}(t)$. It is easy to see 
that $\sigma$ has order $n$ and $\tau$ has order 2. Furthermore, the computation 
$$\tau\circ\sigma\circ\tau(\alpha(t)) = \tau\circ\sigma(\alpha(t^{-1})) = \tau(\alpha((\zeta_n t)^{-1})) = \tau(\alpha(\zeta_n^{-1}t^{-1})) = \alpha(\zeta_n^{-1}t) = \sigma^{-1}(\alpha(t))$$
shows that $\tau\circ\sigma\circ\tau = \sigma^{-1}$. It follows that $\sigma$ and $\tau$ generate a group $G$ of 
automorphisms of $\mathbb{C}(t)$ isomorphic to the dihedral group $D_{2n}$ of order $2n$. If we let $L = \mathbb{C}(t)$, 
then Theorem 7.5.3 implies that 

$$L_G \subset \mathbb{C}(t)$$

is a Galois extension of degree $2n$ with Galois group isomorphic to $D_{2n}$. 
To describe this extension more explicitly, let 
$$t^n + t^{-n} = \frac{t^{2n} + 1}{t^n} \in \mathbb{C}(t).$$
Proposition 7.5.5 implies that $\mathbb{C}(t^n + t^{-n}) \subset \mathbb{C}(t)$ also has degree $2n$. Since $t^n + t^{-n}$ 
is invariant under the action of $\sigma$ and $\tau$, we have extensions 

$$\mathbb{C}(t^n + t^{-n}) \subset L_G \subset \mathbb{C}(t)$$
where $\mathbb{C}(t)$ has degree $2n$ over both smaller fields. Thus $\mathbb{C}(t^n + t^{-n}) = L_G$, and 
$$\mathbb{C}(t^n + t^{-n}) \subset \mathbb{C}(t)$$

is a Galois extension with Galois group isomorphic to $D_{2n}$. ◁▷ 
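
The group structure in Example 7.5.6 can be verified with exact integer bookkeeping. The sketch below rests on an encoding that is our own assumption: the pair `(k, e)` with `e = ±1` stands for the automorphism $\alpha(t) \mapsto \alpha(\zeta_n^k t^e)$, so $\sigma$ is `(1, 1)` and $\tau$ is `(0, -1)`. Composition follows the convention of the displayed computation: applying `b` and then `a` substitutes $a$'s expression for $t$ inside $b$'s.

```python
def aut_product(a, b, n):
    """Return a∘b, where (k, e) encodes alpha(t) -> alpha(zeta_n^k * t^e).

    b sends alpha(t) to alpha(zeta^kb * t^eb); applying a then replaces t by
    zeta^ka * t^ea, giving alpha(zeta^(kb + ka*eb) * t^(ea*eb)).
    """
    ka, ea = a
    kb, eb = b
    return ((kb + ka * eb) % n, ea * eb)


def generated_group(gens, n):
    """Return the set of automorphisms generated by gens."""
    identity = (0, 1)
    elems = {identity}
    frontier = [identity]
    while frontier:
        g = frontier.pop()
        for s in gens:
            h = aut_product(s, g, n)
            if h not in elems:
                elems.add(h)
                frontier.append(h)
    return elems


def image_of_monomial(a, m, n):
    """Image of t^m under a: (zeta^k t^e)^m = zeta^(k*m) t^(e*m), as (k*m mod n, e*m)."""
    k, e = a
    return ((k * m) % n, e * m)


n = 5                      # any n >= 3 gives a dihedral group of order 2n
sigma, tau = (1, 1), (0, -1)
G = generated_group([sigma, tau], n)

tau_sigma_tau = aut_product(tau, aut_product(sigma, tau, n), n)
sigma_inverse = (n - 1, 1)

# t^n + t^(-n) is invariant: each g permutes the two monomials t^n and t^(-n).
invariant = all(
    {image_of_monomial(g, n, n), image_of_monomial(g, -n, n)} == {(0, n), (0, -n)}
    for g in G
)
print(len(G), tau_sigma_tau == sigma_inverse, invariant)
```

The group has $2n$ elements and satisfies $\tau\sigma\tau = \sigma^{-1}$, and every element fixes $t^n + t^{-n}$, which is the input needed to conclude $\mathbb{C}(t^n + t^{-n}) \subset L_G$.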


C. Linear Fractional Transformations. Given a field $F$ and a variable $t$,
$$F \subset F(t)$$
is an extension of infinite degree, so that the theory developed in previous sections
doesn't apply. But we can still define $\mathrm{Gal}(F(t)/F)$ to consist of all automorphisms
of $F(t)$ that are the identity on $F$. We will describe this group using matrices.

Let $GL(2,F)$ be the group of $2 \times 2$ invertible matrices with entries in $F$. Then
$$\begin{pmatrix} a & b \\ c & d \end{pmatrix} \in GL(2,F)$$
gives a rational function
$$\frac{at+b}{ct+d} \in F(t).$$
This relates to $\mathrm{Gal}(F(t)/F)$ as follows.


Theorem 7.5.7 Let $F \subset F(t)$ be as defined above.

(a) For $\gamma = \begin{pmatrix} a & b \\ c & d \end{pmatrix} \in GL(2,F)$, the function $\sigma_\gamma : F(t) \to F(t)$ defined by
$\alpha(t) \mapsto \alpha\big(\frac{at+b}{ct+d}\big)$ is an automorphism that is the identity on $F$. Thus $\sigma_\gamma \in \mathrm{Gal}(F(t)/F)$.

(b) The map $\gamma \mapsto \sigma_{\gamma^{-1}}$ defines a group homomorphism $GL(2,F) \to \mathrm{Gal}(F(t)/F)$.

(c) The homomorphism of part (b) is onto, and its kernel consists of all nonzero
multiples of the identity matrix.


Proof: For part (a), we first observe that the evaluation map sending $t$ to $\frac{at+b}{ct+d} \in F(t)$
induces a ring homomorphism
$$F[t] \to F(t).$$
This map is one-to-one, since $\frac{at+b}{ct+d}$ is transcendental over $F$ by Proposition 7.5.5.
Thus it extends to the field of fractions of $F[t]$, which gives the map
$$\sigma_\gamma : F(t) \to F(t)$$
described in the statement of the theorem. Hence $\sigma_\gamma$ is a one-to-one homomorphism
(see Exercise 2 of Section 3.1). Furthermore, the image of $\sigma_\gamma$ is the subfield $F\big(\frac{at+b}{ct+d}\big)$.
By Proposition 7.5.5, the extension $F\big(\frac{at+b}{ct+d}\big) \subset F(t)$ has degree
$$\max(\deg(at+b), \deg(ct+d)) = 1,$$
since $a$ or $c$ is nonzero. Thus $F\big(\frac{at+b}{ct+d}\big) = F(t)$, so that $\sigma_\gamma$ is onto. It follows that
$\sigma_\gamma \in \mathrm{Gal}(F(t)/F)$, since it is obviously the identity on $F$.

This proves part (a) and shows that we have a map
$$\Phi : GL(2,F) \to \mathrm{Gal}(F(t)/F)$$
defined by $\Phi(\gamma) = \sigma_{\gamma^{-1}}$. The inverse in the definition of $\Phi$ is needed to make it a
group homomorphism, as you will prove in Exercise 4. Part (b) follows.
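To see concretely why the inverse is needed, one can model the automorphism attached to a matrix as substitution into a rational function and compare compositions. The sketch below is illustrative only: the matrices `gamma` and `delta`, the test function `alpha`, and the sample point `t0` are arbitrary choices, and `sigma` is a hypothetical helper name. It shows that applying the automorphism of `delta` and then that of `gamma` agrees with the automorphism of the product `delta * gamma`, so the order of multiplication is reversed; inverting the matrices restores the order.

```python
from fractions import Fraction

def mobius(m):
    """Return the function t -> (a t + b)/(c t + d) for m = ((a, b), (c, d))."""
    (a, b), (c, d) = m
    return lambda t: (a * t + b) / (c * t + d)

def sigma(m):
    """Model sigma_gamma: send alpha to alpha((a t + b)/(c t + d))."""
    sub = mobius(m)
    return lambda alpha: (lambda t: alpha(sub(t)))

def matmul(m, n):
    """Product of two 2 x 2 matrices given as nested tuples."""
    (a, b), (c, d) = m
    (e, f), (g, h) = n
    return ((a * e + b * g, a * f + b * h), (c * e + d * g, c * f + d * h))

gamma = ((1, 2), (0, 1))   # t -> t + 2  (arbitrary example)
delta = ((2, 0), (0, 1))   # t -> 2t     (arbitrary example)

def alpha(t):
    """A hypothetical test function, chosen only for illustration."""
    return (t * t + 1) / (t - 5)

t0 = Fraction(3)
composed = sigma(gamma)(sigma(delta)(alpha))(t0)        # sigma_gamma(sigma_delta(alpha))
via_delta_gamma = sigma(matmul(delta, gamma))(alpha)(t0)
via_gamma_delta = sigma(matmul(gamma, delta))(alpha)(t0)
print(composed == via_delta_gamma)   # the order of the matrices is reversed
print(composed == via_gamma_delta)   # the naive order fails for this pair
```

Exact rational arithmetic (`Fraction`) is used so that the comparison is an equality test rather than a tolerance check.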

For part (c), let $\sigma \in \mathrm{Gal}(F(t)/F)$. Then $\sigma(\alpha(t)) = \alpha(\sigma(t))$ for all $\alpha(t) \in F(t)$,
since $\sigma$ is the identity on $F$. This implies that $F(\sigma(t)) = \sigma(F(t)) = F(t)$. Setting
$\sigma(t) = A(t)/B(t)$, where $A(t), B(t) \in F[t]$ are relatively prime, we can write this as
$F(A(t)/B(t)) = F(t)$. By Proposition 7.5.5, it follows that
$$\max(\deg(A), \deg(B)) = 1.$$
Thus $A(t) = at+b$ and $B(t) = ct+d$, where $a,b,c,d \in F$. In Exercise 5 you will
show that
$$\begin{pmatrix} a & b \\ c & d \end{pmatrix} \in GL(2,F). \tag{7.26}$$
Since $\sigma(t) = \frac{at+b}{ct+d}$, it follows that $\sigma = \sigma_{\gamma^{-1}}$, where $\gamma^{-1}$ is the matrix (7.26). This
proves that $\Phi$ is onto.

Finally, take $\gamma \in GL(2,F)$ in the kernel of $\Phi$. Then $\sigma_{\gamma^{-1}} = 1_{F(t)}$, so that
$$t = \sigma_{\gamma^{-1}}(t) = \frac{at+b}{ct+d}$$
in $F(t)$, where $\gamma^{-1} = \begin{pmatrix} a & b \\ c & d \end{pmatrix}$. Clearing denominators and collecting terms gives
$$ct^2 + (d-a)t - b = 0$$
in $F[t]$. Thus $c = b = 0$ and $a = d$, which shows that $\gamma$ is a nonzero multiple of the
identity matrix. Since all nonzero multiples of the identity are in the kernel (check
this), we get the desired description of the kernel. ∎


If $I_2$ is the $2 \times 2$ identity matrix, then $F^* I_2$ consists of all nonzero multiples
of the identity. Then the Fundamental Theorem of Group Homomorphisms and
Theorem 7.5.7 imply that we have an isomorphism
$$GL(2,F)/F^* I_2 \simeq \mathrm{Gal}(F(t)/F).$$
The quotient group $GL(2,F)/F^* I_2$ is denoted $PGL(2,F)$. Thus
$$PGL(2,F) \simeq \mathrm{Gal}(F(t)/F).$$
The group $PGL(2,F)$ will play an important role in what follows. The "PGL" in
$PGL(2,F)$ stands for projective linear group. We will learn more about projective
linear groups in Chapters 13 and 14.
D. Stereographic Projection. Let $F$ be a field. Then define
$$\hat{F} = F \cup \{\infty\},$$
where $\infty$ is a formal symbol that stands for the "point at infinity" of $F$. Given
$$\gamma = \begin{pmatrix} a & b \\ c & d \end{pmatrix} \in GL(2,F) \quad\text{and}\quad \alpha \in \hat{F},$$
we get $[\gamma] \in PGL(2,F) = GL(2,F)/F^* I_2$ and set
$$[\gamma]\cdot\alpha = \frac{a\alpha+b}{c\alpha+d}. \tag{7.27}$$
In Exercise 6 you will show that this gives a well-defined action of $PGL(2,F)$ on $\hat{F}$.
Furthermore, in Exercise 7 you will prove the following.


Proposition 7.5.8 Let $(\alpha_1,\alpha_2,\alpha_3)$ and $(\beta_1,\beta_2,\beta_3)$ be triples of distinct points of $\hat{F}$.
Then there is a unique $[\gamma] \in PGL(2,F)$ such that
$$[\gamma]\cdot\alpha_i = \beta_i$$
for $i = 1,2,3$. ∎
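The action (7.27) and the statement of Proposition 7.5.8 can be explored over $F = \mathbb{Q}$ with exact arithmetic. The sketch below is not the construction used in the text (existence is deferred to a later exercise); it uses the standard cross-ratio idea of first sending a triple to $0, 1, \infty$, and it only handles triples of finite points. The names `INF`, `act`, `to_standard`, and `triple_map` are hypothetical helpers introduced for this illustration.

```python
from fractions import Fraction

INF = "inf"   # formal symbol for the point at infinity

def act(m, z):
    """Apply [gamma] to z via (7.27), with the usual conventions at infinity."""
    (a, b), (c, d) = m
    if z == INF:
        return INF if c == 0 else Fraction(a) / Fraction(c)
    num, den = a * z + b, c * z + d
    # den == 0 forces num != 0 because the matrix is invertible
    return INF if den == 0 else Fraction(num) / Fraction(den)

def matmul(m, n):
    """Product of two 2 x 2 matrices given as nested tuples."""
    (a, b), (c, d) = m
    (e, f), (g, h) = n
    return ((a * e + b * g, a * f + b * h), (c * e + d * g, c * f + d * h))

def to_standard(z1, z2, z3):
    """Matrix sending the distinct finite points z1, z2, z3 to 0, 1, INF."""
    return ((z2 - z3, -z1 * (z2 - z3)), (z2 - z1, -z3 * (z2 - z1)))

def adjugate(m):
    """A representative of the inverse class of [m] in PGL(2, F)."""
    (a, b), (c, d) = m
    return ((d, -b), (-c, a))

def triple_map(alphas, betas):
    """A matrix whose class sends alphas[i] to betas[i] (finite points only)."""
    return matmul(adjugate(to_standard(*betas)), to_standard(*alphas))

g = triple_map((0, 1, 2), (3, 5, 7))
print(g, [act(g, a) for a in (0, 1, 2)])
```

Since scalar multiples of a matrix act identically, the adjugate can stand in for the inverse, which keeps all entries integral in this example.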


We will assume that $F = \mathbb{C}$ for the rest of the section. This will allow geometry
to give some interesting Galois groups. It is customary to call $\hat{\mathbb{C}} = \mathbb{C} \cup \{\infty\}$ the
Riemann sphere because we can map the unit sphere $S^2 \subset \mathbb{R}^3$ to $\hat{\mathbb{C}}$ by stereographic
projection.

The unit sphere $S^2$ is defined by $x^2 + y^2 + z^2 = 1$ in $\mathbb{R}^3$, and we identify $a + bi \in \mathbb{C}$
with $(a,b,0) \in \mathbb{R}^3$, so that $\mathbb{C}$ becomes the plane $z = 0$ in $\mathbb{R}^3$. Then define
$$\pi : S^2 \setminus \{(0,0,1)\} \to \mathbb{C}$$
as follows. Given $P \in S^2 \setminus \{(0,0,1)\}$, we draw the line connecting $P$ to the north
pole $(0,0,1)$ and define $\pi(P)$ to be the point where the line intersects the $xy$-plane.
Here is the picture: 
In this picture, the $xy$-plane is shaded gray and the top half of $S^2$ is shown as a wire
frame. In Exercise 8 you will show that
$$\pi(a,b,c) = \Big(\frac{a}{1-c}, \frac{b}{1-c}, 0\Big) = \frac{a}{1-c} + \frac{b}{1-c}\,i, \tag{7.28}$$
where the last equality uses the above identification of $\mathbb{C}$ with the $xy$-plane in $\mathbb{R}^3$.
Under the map $\pi$, the south pole $(0,0,-1)$ maps to $0 \in \mathbb{C}$ and the equator of the
sphere maps to the unit circle $\{z \in \mathbb{C} \mid |z| = 1\}$.

We then extend $\pi$ to stereographic projection
$$\hat{\pi} : S^2 \to \hat{\mathbb{C}}$$
by defining $\hat{\pi}(0,0,1) = \infty$ and $\hat{\pi}(P) = \pi(P)$ for $P \in S^2 \setminus \{(0,0,1)\}$. Note that $\hat{\pi}$ is
one-to-one and onto.
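Formula (7.28) is easy to experiment with numerically. The sketch below implements it directly; the inverse map `stereo_inverse` is the standard formula for the inverse of stereographic projection and is not taken from the text, so treat it as an assumption. The checks confirm that the south pole maps to 0, that a point on the equator lands on the unit circle, and that the two maps undo each other at a sample point.

```python
import math

def stereo(x, y, z):
    """Formula (7.28): project a point of S^2 other than the north pole to C."""
    return complex(x / (1 - z), y / (1 - z))

def stereo_inverse(w):
    """Standard inverse of stereographic projection (assumed, not from the text)."""
    s = abs(w) ** 2
    return (2 * w.real / (s + 1), 2 * w.imag / (s + 1), (s - 1) / (s + 1))

theta = 1.234                                   # arbitrary angle on the equator
on_equator = stereo(math.cos(theta), math.sin(theta), 0.0)
w = 0.5 - 2.0j                                  # arbitrary point of C
x, y, z = stereo_inverse(w)
print(stereo(0.0, 0.0, -1.0))                   # south pole
print(abs(on_equator))                          # should be close to 1
print(abs(x * x + y * y + z * z - 1) < 1e-12)   # the preimage lies on S^2
print(abs(stereo(x, y, z) - w) < 1e-12)         # round trip
```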
The key geometric property of $\hat{\pi}$ is the following. Consider a rotation $r$ of $S^2$
about some axis through the origin. This gives a map
$$r : S^2 \to S^2.$$
We then obtain the map
$$\hat{\pi} \circ r \circ \hat{\pi}^{-1} : \hat{\mathbb{C}} \to \hat{\mathbb{C}}$$
by composition. The remarkable fact is that this map is given by a linear fractional
transformation. To state our result more precisely, let
$$\mathrm{Rot}(S^2)$$
be the group of all rotations of the sphere. A careful description of this group can be
found in [9, Sec. 2.2], together with a proof of the following important result.


Theorem 7.5.9 Given $r \in \mathrm{Rot}(S^2)$, there is a unique $[\gamma] \in PGL(2,\mathbb{C})$ such that
$$\hat{\pi} \circ r \circ \hat{\pi}^{-1}(z) = [\gamma]\cdot z$$
for all $z \in \hat{\mathbb{C}}$. Furthermore, the map
$$\mathrm{Rot}(S^2) \to PGL(2,\mathbb{C})$$
defined by $r \mapsto [\gamma]$ is a one-to-one group homomorphism. ∎
Here is an example of this result. 
Example 7.5.10 Consider the octahedron with vertices at $(\pm 1,0,0)$, $(0,\pm 1,0)$, and
$(0,0,\pm 1)$ in $\mathbb{R}^3$. This is inscribed in the sphere $S^2$ and when combined with
stereographic projection gives the picture

This shows the top half of the octahedron. Note that $(1,0,0)$ becomes $1 \in \mathbb{C}$ and
$(0,1,0)$ becomes $i \in \mathbb{C}$, so that the six vertices of the octahedron map to the points
$$0, \pm 1, \pm i, \infty \in \hat{\mathbb{C}}.$$

Now consider the rotation $r_1$ of $S^2$ by $180^\circ$ about the $x$-axis. By Theorem 7.5.9,
this gives an element of $PGL(2,\mathbb{C})$ that takes 1 to itself and interchanges 0 and $\infty$.
However, note that
$$\gamma_1 = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}$$
satisfies
$$[\gamma_1]\cdot z = \frac{1}{z}$$
for $z \in \hat{\mathbb{C}}$. This also takes 1 to itself and interchanges 0 and $\infty$. Since $[\gamma_1]$ is uniquely
determined by its values on $0, 1, \infty$ (this is Proposition 7.5.8), it follows that $r_1$
corresponds to $[\gamma_1]$.

Similarly, consider the rotation $r_2$ of $S^2$ by $90^\circ$ counterclockwise about the $z$-axis.
In Exercise 9 you will show that $r_2$ corresponds to $[\gamma_2]$, where
$$\gamma_2 = \begin{pmatrix} i & 0 \\ 0 & 1 \end{pmatrix}.$$
Note that
$$[\gamma_2]\cdot z = iz$$
for $z \in \hat{\mathbb{C}}$.

Finally, consider the rotation $r_3$ of $S^2$ about the $y$-axis that takes $(0,0,1)$ to $(1,0,0)$.
Under stereographic projection, this corresponds to an element $[\gamma_3] \in PGL(2,\mathbb{C})$ that
fixes $\pm i$ and maps
$$\infty \mapsto 1 \mapsto 0 \mapsto -1 \mapsto \infty.$$
Be sure that you can see this in the above picture. In Exercise 9 you will show that
$$\gamma_3 = \begin{pmatrix} 1 & -1 \\ 1 & 1 \end{pmatrix}.$$
Thus
$$[\gamma_3]\cdot z = \frac{z-1}{z+1}$$
for $z \in \hat{\mathbb{C}}$.

In Exercise 10 you will show that the rotation group of the octahedron is isomorphic
to $S_4$ and is generated by the rotations $r_1$, $r_2$, and $r_3$. By the isomorphism of
Theorem 7.5.9, it follows that the subgroup
$$G = \langle [\gamma_1], [\gamma_2], [\gamma_3] \rangle \subset PGL(2,\mathbb{C})$$
is isomorphic to $S_4$. By Theorem 7.5.7, we can regard $G$ as a group of automorphisms
of $L = \mathbb{C}(t)$, and then Theorem 7.5.3 shows that
$$L_G \subset L = \mathbb{C}(t)$$
is a Galois extension with Galois group $G \simeq S_4$. ◁
Example 7.5.10 will have an unexpected application in Chapter 14. 
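As a sanity check on the order of the group in Example 7.5.10, one can record how each of the three linear fractional transformations permutes the six projected vertices $0, \pm 1, \pm i, \infty$ and then generate the resulting permutation group. Because an element of $PGL(2,\mathbb{C})$ is determined by its values on three points (Proposition 7.5.8), this permutation group has the same order as $G$. The sketch below is only a numerical illustration; the helper names `act`, `perm`, and `generate` are hypothetical, and it does not by itself identify the group with $S_4$.

```python
INF = None   # stands for the point at infinity

def act(m, z):
    """Apply the linear fractional transformation of m = ((a, b), (c, d)) to z."""
    (a, b), (c, d) = m
    if z is INF:
        return INF if c == 0 else a / c
    den = c * z + d
    return INF if den == 0 else (a * z + b) / den

G1 = ((0, 1), (1, 0))      # z -> 1/z
G2 = ((1j, 0), (0, 1))     # z -> i z
G3 = ((1, -1), (1, 1))     # z -> (z - 1)/(z + 1)
VERTICES = [0, 1, -1, 1j, -1j, INF]

def index_of(z):
    """Position of z in VERTICES, matching finite points up to rounding error."""
    for k, v in enumerate(VERTICES):
        if z is INF and v is INF:
            return k
        if z is not INF and v is not INF and abs(z - v) < 1e-9:
            return k
    raise ValueError("not a vertex")

def perm(m):
    """Permutation of vertex indices induced by m."""
    return tuple(index_of(act(m, v)) for v in VERTICES)

def generate(gens):
    """All permutations obtainable as products of the generators."""
    identity = tuple(range(len(VERTICES)))
    seen, frontier = {identity}, [identity]
    while frontier:
        p = frontier.pop()
        for g in gens:
            q = tuple(g[p[k]] for k in range(len(p)))   # apply p, then g
            if q not in seen:
                seen.add(q)
                frontier.append(q)
    return seen

group = generate([perm(G1), perm(G2), perm(G3)])
print(perm(G1), perm(G2), perm(G3), len(group))
```

In a finite group, closing up under products of the generators already yields inverses, so the set returned by `generate` is the full subgroup they generate.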
Mathematical Notes 
There are many ideas in this section to discuss. 


■ Finite Subgroups of Linear Fractional Transformations. Theorems 7.5.3
and 7.5.7 show that any finite subgroup $G \subset PGL(2,F)$ gives a Galois extension
$$L_G \subset L = F(t)$$
with Galois group $G$. The remarkable fact is that for many fields $F$, the finite
subgroups $G \subset PGL(2,F)$ have been classified.

For example, when $F = \mathbb{C}$, $G$ is isomorphic to one of the following groups:

          A cyclic group $C_n$ of order $n$, $n \ge 1$
          A dihedral group $D_{2n}$ of order $2n$, $n \ge 2$
(7.29)    The alternating group $A_4$ of order 12
          The symmetric group $S_4$ of order 24
          The alternating group $A_5$ of order 60.

Furthermore, two finite subgroups of $PGL(2,\mathbb{C})$ are conjugate in $PGL(2,\mathbb{C})$ if and
only if they are isomorphic as abstract groups. Proofs of these assertions can be
found in [2, Sec. 2.13] and [9, Secs. 2.2, 2.6].

Even more remarkable is the geometric origin of these subgroups of $PGL(2,\mathbb{C})$.
We saw in Example 7.5.10 how the octahedron gave a subgroup $G \subset PGL(2,\mathbb{C})$
isomorphic to $S_4$. In a similar way, the tetrahedron and icosahedron give subgroups
isomorphic to $A_4$ and $A_5$, respectively. Furthermore, you will show in Exercise 12
how to realize $C_n$ and $D_{2n}$ as the symmetry groups of polyhedra. Thus polyhedra give
Galois groups!

Chapter 14 will discuss the subgroups of $PGL(2,F)$ when $F$ is a finite field.


■ Invariant Theory. Examples 7.5.6 and 7.5.10 gave extensions with Galois groups
$D_{2n}$ and $S_4$, respectively. For $D_{2n}$ the extension was given explicitly as
$$\mathbb{C}(t^n + t^{-n}) \subset \mathbb{C}(t),$$
while for $S_4$, we merely wrote
$$L_G \subset L = \mathbb{C}(t), \quad G \simeq S_4. \tag{7.30}$$
In fact, one can show that for the group $G = \langle [\gamma_1], [\gamma_2], [\gamma_3] \rangle \subset PGL(2,\mathbb{C})$ of
Example 7.5.10, the extension (7.30) is given by
$$\mathbb{C}\Big(\frac{(t^8 + 14t^4 + 1)^3}{t^4(t^4-1)^4}\Big) \subset \mathbb{C}(t).$$
While the invariance of $t^n + t^{-n}$ under $D_{2n}$ is obvious, it is not at all clear that
$$\alpha = \frac{(t^8 + 14t^4 + 1)^3}{t^4(t^4-1)^4}$$
is invariant under $G = \langle [\gamma_1], [\gamma_2], [\gamma_3] \rangle$. But even if $\alpha$ is invariant, how does one find
$\alpha$ in the first place? The answer involves invariant theory and is beyond the scope
of this book. But as a small hint of where $\alpha$ comes from, we note that $t^8 + 14t^4 + 1$
has the following geometric description. The octahedron has eight faces that are
equilateral triangles. If we project from the center of the sphere, the center of each
face gives a point on $S^2$. Under stereographic projection, these give eight points of $\mathbb{C}$.
Then the roots of $t^8 + 14t^4 + 1$ are precisely these eight points.

Invariant theory gives similar formulas for the extensions coming from the tetra- 
hedron and icosahedron. Complete details can be found in Chapter 3 of [9]. 
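The invariance of the octahedral function under the three generators $z \mapsto 1/z$, $z \mapsto iz$, and $z \mapsto (z-1)/(z+1)$ of Example 7.5.10, and the geometric description of the roots of $t^8 + 14t^4 + 1$, can both be spot-checked. The sketch below assumes the function has the form $(t^8+14t^4+1)^3 / (t^4(t^4-1)^4)$; the sample points are arbitrary, and the face center used is the one in the first octant, $(1,1,1)/\sqrt{3}$, projected by formula (7.28).

```python
from fractions import Fraction
import math

def alpha(t):
    """Assumed form of the invariant: (t^8 + 14 t^4 + 1)^3 / (t^4 (t^4 - 1)^4)."""
    return (t**8 + 14 * t**4 + 1) ** 3 / (t**4 * (t**4 - 1) ** 4)

# Exact checks for the two generators with rational coefficients.
z = Fraction(2)                                   # arbitrary rational sample point
same_under_inversion = alpha(1 / z) == alpha(z)   # z -> 1/z
same_under_gamma3 = alpha((z - 1) / (z + 1)) == alpha(z)

# Floating-point check for z -> i z at an arbitrary complex sample point.
w = 0.3 + 0.7j
relative_error = abs(alpha(1j * w) - alpha(w)) / abs(alpha(w))

# Project the face center (1, 1, 1)/sqrt(3) using formula (7.28).
c = 1 / math.sqrt(3)
face_point = complex(c / (1 - c), c / (1 - c))
residual = abs(face_point**8 + 14 * face_point**4 + 1)

print(same_under_inversion, same_under_gamma3, relative_error < 1e-9, residual < 1e-9)
```

A check at finitely many points is only evidence; the invariance itself is an identity of rational functions.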


■ Lüroth's Theorem. The above paragraph suggests that for any finite subgroup
$G \subset PGL(2,\mathbb{C})$, there is $\alpha \in \mathbb{C}(t)$ such that if $L = \mathbb{C}(t)$, then
$$L_G = \mathbb{C}(\alpha).$$
In fact, more is true: given any field $F$ and any intermediate field $F \subset K \subset F(t)$
with $[F(t) : K] < \infty$, there is $\alpha \in F(t)$ such that $K = F(\alpha)$. This result is known as
Lüroth's Theorem. Proofs may be found in [6, Sec. 9.5] and [9, Sec. 6.3].


■ The Quintic and the Icosahedron. When we use the rotations of the icosahedron,
we get a Galois extension
$$L_G \subset L = \mathbb{C}(t)$$
with Galois group $G \simeq A_5$. By [9, Sec. 4.8], there is $W \in \mathbb{C}(t)$ such that
$$L_G = \mathbb{C}(W)$$
and $\mathbb{C}(W) \subset \mathbb{C}(t)$ is the splitting field of the irreducible quintic
$$x^5 - 10Wx^3 + 45W^2x - W^2 \in \mathbb{C}(W)[x].$$
This is called a Brioschi quintic.

The results of Chapter 8 will imply that the Brioschi quintic is not solvable by
radicals over $\mathbb{C}(W)$. However, in the nineteenth century it was discovered that
(roughly speaking) the solution of any quintic can be reduced to solving the Brioschi
resolvent. This involves a rich collection of ideas described by Klein in [4] and more
recently in the book [9] and the poster [10].

The Galois group of an irreducible quintic will be computed in Chapter 13. 


Exercises for Section 7.5 


Exercise 1. Let $P, Q \in F[x,y]$ be polynomials such that $P \mid Q$ and $P \in F[x]$, and write $Q =
a_0(x) + a_1(x)y + a_2(x)y^2 + \cdots + a_m(x)y^m$. Prove that $P \mid a_i$ for $i = 0,\dots,m$.


Exercise 2. In the proof of Proposition 7.5.5, we showed that $a(x) - y\,b(x)$ is irreducible in
$F[x,y]$ and we want to conclude that it is also irreducible in $F(y)[x]$. Prove this using the
version of Gauss's Lemma stated in Theorem A.5.8.


Exercise 3. The proof of Proposition 7.5.5 shows that $a(x) - y\,b(x)$ is irreducible in $F[x,y]$. In
this exercise, you will give an elementary proof that $a(x) - y\,b(x)$ is irreducible in $F(y)[x]$.
Suppose that
$$a(x) - y\,b(x) = AB, \quad A, B \in F(y)[x].$$
You need to prove that $A$ or $B$ is constant, which in this case means that $A$ or $B$ lies in $F(y)$.
(a) Show that there are nonzero polynomials $g(y), h(y) \in F[y]$ that clear the denominators of
$A$ and $B$, i.e., $g(y)A = A_1$ and $h(y)B = B_1$ for some $A_1, B_1 \in F[x,y]$.
(b) Show that $g(y)h(y)(a(x) - y\,b(x)) = A_1B_1$ in $F[x,y]$ and explain why $a(x) - y\,b(x)$ must
divide either $A_1$ or $B_1$ in $F[x,y]$.
(c) Assume that $A_1 = (a(x) - y\,b(x))A_2$, where $A_2 \in F[x,y]$. Show that this implies that
$g(y)h(y) = A_2B_1$, and then conclude that $B_1 \in F[y]$.
(d) Show that $B \in F(y)$.


Exercise 4. Prove that the map $\Phi : GL(2,F) \to \mathrm{Gal}(F(t)/F)$ defined in the proof of
Theorem 7.5.7 is a group homomorphism.


Exercise 5. Prove (7.26).


Exercise 6. In this exercise, you will prove that $PGL(2,F)$ acts on $\hat{F} = F \cup \{\infty\}$.
(a) First show that
$$\gamma \cdot \alpha = \frac{a\alpha + b}{c\alpha + d}, \quad \gamma = \begin{pmatrix} a & b \\ c & d \end{pmatrix}$$
defines an action of $GL(2,F)$ on $\hat{F}$. Explain carefully what happens when $\alpha = \infty$.
(b) Show that nonzero multiples of the identity matrix act trivially on $\hat{F}$, and use this to give
a careful proof that (7.27) gives a well-defined action of $PGL(2,F)$ on $\hat{F}$.
Exercise 7. Proposition 7.5.8 asserts that we can map any triple of distinct points of $\hat{F}$ to any
other such triple via a unique element $[\gamma] \in PGL(2,F)$. We will defer the proof of existence
of $[\gamma]$ until Exercise 24 in Section 14.3. In this exercise, we will prove the uniqueness part of
the proposition, since this is what is used in Example 7.5.10.
(a) First suppose that $[\gamma] \in PGL(2,F)$ fixes $\infty$ and also fixes two points $\alpha \ne \beta$ of $F$. Prove
that $\gamma$ is a nonzero multiple of the identity matrix.
(b) Now suppose that $[\gamma] \in PGL(2,F)$ fixes three distinct points of $\hat{F}$, and let $\alpha$ be one of
these points. Show that there is $[\delta] \in PGL(2,F)$ such that $[\delta] \cdot \alpha = \infty$. Then prove that
$\gamma$ is a nonzero multiple of the identity matrix by applying part (a) to $[\delta\gamma\delta^{-1}]$.
(c) Show that the desired uniqueness follows from parts (a) and (b).


Exercise 8. Prove the formula (7.28) for stereographic projection. 


Exercise 9. In Example 7.5.10, we considered rotations $r_1, r_2, r_3$ of the octahedron and defined
matrices $\gamma_1, \gamma_2, \gamma_3 \in GL(2,\mathbb{C})$. We also proved carefully that $r_1$ corresponds to $[\gamma_1]$ under the
homomorphism of Theorem 7.5.9. In a similar way, prove that $r_2$ corresponds to $[\gamma_2]$ and $r_3$
corresponds to $[\gamma_3]$.


Exercise 10. The goal of this exercise is to prove that the symmetry group $G$ of the octahedron
is isomorphic to $S_4$. By symmetry group, we mean the group of rotations that carry the
octahedron to itself. We think of $G$ as acting on the octahedron.
(a) Let $v$ be a vertex of the octahedron. Use the action of $G$ on $v$ and the Fundamental
Theorem of Group Actions to prove that $|G| = 24$.
(b) The eight face centers of the octahedron form the vertices of an inscribed cube. Explain
why the octahedron and its inscribed cube have the same symmetry group.
(c) The cube has four long diagonals that connect a vertex to an opposite vertex. Explain
why the action of $G$ on these diagonals gives a group homomorphism $G \to S_4$.
(d) Let $r_1, r_2, r_3 \in G$ be the rotations described in Example 7.5.10. Explain how each rotation
acts on the inscribed cube and describe its corresponding permutation in $S_4$.
(e) Prove that the three permutations constructed in part (d) generate $S_4$.
(f) Use parts (a) and (e) to show that $G \simeq S_4$. Also prove that $G$ is generated by $r_1, r_2, r_3$.
See Section 14.4 for a different approach to proving that a group is isomorphic to $S_4$.


Exercise 11. In Section 6.4, we defined the one-dimensional affine linear group $AGL(1,\mathbb{F}_p)$
over the finite field $\mathbb{F}_p$. More generally, if $F$ is any field, then $AGL(1,F)$ consists of all
functions $\gamma_{a,b} : F \to F$ defined by
$$\gamma_{a,b}(\alpha) = a\alpha + b, \quad \alpha \in F,$$
where $a \in F^*$, $b \in F$, and the group structure is given by composition. In this exercise, you
will represent $AGL(1,F)$ as a subgroup of $PGL(2,F)$.
(a) Show that the map $\gamma_{a,b} \mapsto \big[\begin{pmatrix} a & b \\ 0 & 1 \end{pmatrix}\big]$ defines a one-to-one group homomorphism
$$AGL(1,F) \to PGL(2,F).$$
(b) Consider the action of $PGL(2,F)$ on $\hat{F}$. Show that the isotropy subgroup of $PGL(2,F)$
acting on $\infty$ is the image of the homomorphism of part (a).


Exercise 12. In this exercise, you will construct polyhedra whose symmetry groups are
isomorphic to $C_n$ and $D_{2n}$. For $D_{2n}$, consider the polyhedron whose vertices are the north and
south poles of $S^2$ together with the $n$th roots of unity along the equator. For $n = 8$, this gives
the following picture:

As usual, this shows only the top half of the polyhedron. Note that to obtain a three-dimensional
object, we must assume $n \ge 3$.
(a) Show that the symmetry group of this polyhedron is isomorphic to $D_{2n}$ when $n \ne 4$ and
$S_4$ when $n = 4$.
(b) Now take the vertices on the equator and move them up in $S^2$ so that they become the
vertices of a regular $n$-gon lying in the plane $z = c$, where $c > 0$ is small. Prove that the
symmetry group of this polyhedron is isomorphic to $C_n$.
(c) Find polyhedra inscribed in $S^2$ whose symmetry groups are $C_1$ (the trivial group), $C_2$, $D_4$
(the Klein four-group), and $D_8$, respectively.
Notice that the symmetry groups of these polyhedra, together with those of the tetrahedron,
octahedron, and icosahedron, give all of the groups listed in (7.29).


Exercise 13. Consider the automorphism of $L = \mathbb{C}(t)$ defined by $\alpha(t) \mapsto \alpha(\zeta_n t)$. This generates
a cyclic group $G$ of automorphisms such that $|G| = n$. Adapt the methods of Example 7.5.6
to show that $L_G = \mathbb{C}(t^n)$ and conclude that $\mathbb{C}(t^n) \subset \mathbb{C}(t)$ is a Galois extension whose Galois
group is cyclic of order $n$.


Exercise 14. Consider the automorphisms of $L = F(t)$ defined by
$$\sigma(\alpha(t)) = \alpha(t^{-1}) \quad\text{and}\quad \tau(\alpha(t)) = \alpha(1-t).$$
(a) Prove that $\sigma$ and $\tau$ generate a group $G$ of automorphisms of $F(t)$ isomorphic to $S_3$.
(b) Show that $G$ corresponds to the subgroup of $PGL(2,F)$ consisting of all elements that
map the subset $\{0, 1, \infty\} \subset \hat{F}$ to itself.
(c) Prove that
$$L_G = F\Big(\frac{(t^2-t+1)^3}{t^2(t-1)^2}\Big),$$
and conclude that
$$F\Big(\frac{(t^2-t+1)^3}{t^2(t-1)^2}\Big) \subset F(t)$$
is a Galois extension with Galois group $G \simeq S_3$.
PART III


APPLICATIONS 


The next four chapters give classic applications of Galois theory. 

Cardan's formulas lead to the notion of solvability by radicals. In Chapter 8, we
relate this to the notion of a solvable group and use Galois theory to show that in
general, polynomials of degree $\ge 5$ cannot be solved by radicals.

Chapter 9 discusses the Galois theory of the cyclotomic extension $\mathbb{Q} \subset \mathbb{Q}(\zeta_n)$,
where $\zeta_n$ is a primitive $n$th root of unity. We also explain how Gauss analyzed this
extension when $n$ is a prime.

In Chapter 10, we study straightedge-and-compass constructions from the point 
of view of Galois theory. This includes the classic Greek problems (trisecting the 
angle, duplicating the cube, squaring the circle) as well as the construction of regular 
polygons. We also explore what happens when we go beyond straightedge and 
compass to allow constructions using origami. 

Finally, Chapter 11 explores the theory of finite fields. We consider the structure 
of finite fields and compute the Galois groups involved. We also describe irreducible 
polynomials and cyclotomic polynomials over a finite field. 


CHAPTER 8 


SOLVABILITY BY RADICALS 


In this chapter, we will use the Galois theory developed in Chapter 7 to determine 
when a polynomial equation can be solved by radicals. The idea is to translate the 
problem into group theory. Hence we begin with the group-theoretic concept of 
solvable group. 


8.1 SOLVABLE GROUPS 


Here is the basic definition of this section. 
Definition 8.1.1 A finite group $G$ is solvable if there are subgroups
$$\{e\} = G_n \subset G_{n-1} \subset \cdots \subset G_1 \subset G_0 = G$$
such that for $i = 1,\dots,n$ we have:
(a) $G_i$ is normal in $G_{i-1}$.
(b) $[G_{i-1} : G_i]$ is prime.


Since $G_i$ is normal in $G_{i-1}$, part (b) of the definition can be replaced by the
equivalent assertion that $G_{i-1}/G_i$ is a cyclic group of prime order.
We will show below that every finite Abelian group is solvable. Here is an example
of a non-Abelian solvable group.


Example 8.1.2 The subgroups
$$\{e\} \subset A_3 \subset S_3$$
show that $S_3$ is solvable, since each subgroup is normal in the next and the indices
$[A_3 : \{e\}] = 3$, $[S_3 : A_3] = 2$ are prime. ◁
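The conditions of Definition 8.1.1 can be checked mechanically for a given chain of subgroups. The sketch below models permutations of $\{0,1,2\}$ as tuples and verifies the chain of Example 8.1.2. The helper names (`chain_shows_solvable` and so on) are hypothetical, and the function assumes that each set in the chain is already known to be a subgroup; it only tests containment, normality, and prime index.

```python
from itertools import permutations

def compose(p, q):
    """Return the permutation p∘q (apply q first, then p)."""
    return tuple(p[q[i]] for i in range(len(q)))

def inverse(p):
    """Inverse permutation of p."""
    inv = [0] * len(p)
    for i, j in enumerate(p):
        inv[j] = i
    return tuple(inv)

def is_even(p):
    """A permutation is even when it has an even number of inversions."""
    n = len(p)
    return sum(1 for i in range(n) for j in range(i + 1, n) if p[i] > p[j]) % 2 == 0

def is_normal(sub, grp):
    """Check g h g^{-1} in sub for all g in grp and h in sub."""
    return all(compose(compose(g, h), inverse(g)) in sub for g in grp for h in sub)

def is_prime(k):
    return k >= 2 and all(k % d for d in range(2, int(k ** 0.5) + 1))

def chain_shows_solvable(chain):
    """chain = [G_0, ..., G_n] with G_0 = G and G_n = {e}, as in Definition 8.1.1."""
    return len(chain[-1]) == 1 and all(
        small <= big and is_normal(small, big) and is_prime(len(big) // len(small))
        for big, small in zip(chain, chain[1:])
    )

S3 = set(permutations(range(3)))
A3 = {p for p in S3 if is_even(p)}
E = {(0, 1, 2)}
print(chain_shows_solvable([S3, A3, E]))   # the chain of Example 8.1.2
print(chain_shows_solvable([S3, E]))       # index 6 is not prime, so this chain fails
```

A failing chain says nothing about the group itself; solvability only requires that some chain with the stated properties exists.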


In Exercise 1 you will prove similarly that $A_4$ and $S_4$ are solvable. On the other
hand, we will see in Section 8.4 that $A_n$ and $S_n$ are nonsolvable for $n \ge 5$.
Here is our first result about solvability. 


Proposition 8.1.3 Every subgroup of a solvable finite group is solvable. 


Proof: Let $G$ be finite and solvable with subgroups $G_i$ as in Definition 8.1.1. Given
a subgroup $H \subset G$, set $H_i = G_i \cap H$ and note that
$$H_0 = G_0 \cap H = G \cap H = H,$$
$$H_n = G_n \cap H = \{e\} \cap H = \{e\}.$$
Then consider the group homomorphism
$$\pi : H_{i-1} \to G_{i-1}/G_i$$
that sends $h \in H_{i-1}$ to the coset $hG_i \in G_{i-1}/G_i$. Observe that $h \in H_{i-1}$ is in the kernel
of $\pi$ if and only if $hG_i = G_i$, which happens if and only if
$$h \in H_{i-1} \cap G_i = (G_{i-1} \cap H) \cap G_i = H \cap G_i = H_i,$$
where the second equality follows from $G_i \subset G_{i-1}$. This shows that $\mathrm{Ker}(\pi) = H_i$.
Thus $H_i$ is normal in $H_{i-1}$. By the Fundamental Theorem of Group Homomorphisms,
we get an isomorphism
$$H_{i-1}/H_i = H_{i-1}/\mathrm{Ker}(\pi) \simeq \mathrm{Im}(\pi) \subset G_{i-1}/G_i.$$
Since $G_{i-1}/G_i$ is cyclic of prime order, it follows that $H_{i-1}/H_i$ is either trivial or
isomorphic to $G_{i-1}/G_i$. Thus either $H_{i-1} = H_i$ or $[H_{i-1} : H_i]$ is prime. Hence, by
discarding duplicates, the subgroups
$$\{e\} = H_n \subset \cdots \subset H_i \subset H_{i-1} \subset \cdots \subset H_0 = H$$
show that $H$ is solvable. ∎


Here is one of the main theoretical tools for dealing with solvable groups. 


Theorem 8.1.4 Let $G$ be a finite group and $H$ a normal subgroup. Then $G$ is solvable
if and only if $H$ and $G/H$ are.


Proof: First suppose that $G$ is solvable. Then $H$ is solvable by Proposition 8.1.3. To
show that $G/H$ is solvable, suppose that $G_i$ are subgroups of $G$ as in Definition 8.1.1,
and let $\pi : G \to G/H$ be the group homomorphism $g \mapsto gH$. Then let $\overline{G}_i = \pi(G_i)$. In
Exercise 2 you will show the following:

(a) $G_0 = G$ implies that $\overline{G}_0 = G/H$.
(b) $G_n = \{e\}$ implies that $\overline{G}_n = \{eH\}$ (where $eH$ is the identity of $G/H$).
(c) $G_i$ normal in $G_{i-1}$ implies that $\overline{G}_i$ is normal in $\overline{G}_{i-1}$.
(d) The map $G_{i-1}/G_i \to \overline{G}_{i-1}/\overline{G}_i$ given by $gG_i \mapsto \pi(g)\overline{G}_i$ is a well-defined onto
group homomorphism.

By assumption $G_{i-1}/G_i$ is a group of prime order. In Exercise 2 you will show that
this fact and (d) imply that $\overline{G}_{i-1}/\overline{G}_i$ is either trivial or also has prime order. Thus
either $\overline{G}_{i-1} = \overline{G}_i$ or $[\overline{G}_{i-1} : \overline{G}_i]$ is prime. Then, as in the proof of Proposition 8.1.3,
discarding duplicates among the subgroups
$$\{eH\} = \overline{G}_n \subset \cdots \subset \overline{G}_i \subset \overline{G}_{i-1} \subset \cdots \subset \overline{G}_0 = G/H$$
shows that $G/H$ is solvable.

Conversely suppose that $H$ and $G/H$ are solvable. Let $H_i$, $i = 0,\dots,\ell$, be subgroups
of $H$ satisfying Definition 8.1.1, and similarly let $\overline{G}_j$, $j = 0,\dots,m$, be subgroups of
$G/H$ satisfying the definition.

As above, we have $\pi : G \to G/H$. Given a subgroup $K \subset G/H$, set
$$\pi^{-1}(K) = \{g \in G \mid \pi(g) \in K\}. \tag{8.1}$$
In Exercise 3 you will verify that $\pi^{-1}(K)$ is a subgroup of $G$. You will also check
that the kernel of $\pi$ is
$$H = \pi^{-1}(\{eH\})$$
and that
$$G = \pi^{-1}(G/H).$$
If we apply this to the subgroups
$$\{eH\} = \overline{G}_m \subset \cdots \subset \overline{G}_0 = G/H,$$
then we obtain the subgroups
$$H = \pi^{-1}(\{eH\}) = \pi^{-1}(\overline{G}_m) \subset \cdots \subset \pi^{-1}(\overline{G}_0) = \pi^{-1}(G/H) = G.$$
However, we also have the subgroups
$$\{e\} = H_\ell \subset \cdots \subset H_0 = H.$$
We "glue together" these sequences of subgroups by defining $G_i \subset G$ to be
$$G_i = \begin{cases} \pi^{-1}(\overline{G}_i), & 0 \le i \le m, \\ H_{i-m}, & m \le i \le \ell+m. \end{cases}$$
Note that the sequences are "joined" at $G_m = \pi^{-1}(\overline{G}_m) = H_0 = H$. It remains to show
that $G_i$ is normal in $G_{i-1}$ of prime index. For $m < i \le \ell+m$, this is obvious, since
$G_i = H_{i-m}$ in this range and the $H_i$ satisfy Definition 8.1.1. For $0 < i \le m$, we leave
it as Exercise 4 to show that for this range of indices, $G_i$ is normal in $G_{i-1}$ and that
$$G_{i-1}/G_i \simeq \overline{G}_{i-1}/\overline{G}_i. \tag{8.2}$$
Since the $\overline{G}_i$ satisfy Definition 8.1.1, it follows that $G$ is solvable. This completes the
proof of the theorem. ∎


We next use Theorem 8.1.4 to show that Abelian groups are solvable. 


Proposition 8.1.5 Every finite Abelian group G is solvable. 


Proof: We will prove the proposition by complete induction on $n = |G|$. The case
$n = 1$ being trivial, assume that $G$ is an Abelian group of order $n > 1$ and that the
result is true for all Abelian groups of order $< n$.

Let $p$ be a prime divisor of $|G|$. If $p = |G|$, then $G$ is cyclic of order $p$ and hence
solvable. If $p < |G|$, then by Cauchy's Theorem (Theorem A.1.5), we can find $g \in G$
of order $p$. Now let $H = \langle g \rangle$ be the subgroup generated by $g$. Then $H$ is normal, since
$G$ is Abelian, and $|H| = p < |G|$. It follows that the orders of $H$ and $G/H$ are strictly
smaller than $|G| = n$. Hence $H$ and $G/H$ are solvable (by our inductive assumption),
and then Theorem 8.1.4 implies that $G$ is solvable. ∎


Here is an interesting non-Abelian solvable group. 


Example 8.1.6 The one-dimensional affine linear group $AGL(1,\mathbb{F}_p)$ over $\mathbb{F}_p$ was
introduced in Section 6.4. There, the discussion leading up to (6.6) showed that
$AGL(1,\mathbb{F}_p)$ has a normal subgroup $T \simeq \mathbb{F}_p$ with quotient $AGL(1,\mathbb{F}_p)/T \simeq \mathbb{F}_p^*$.
Since $\mathbb{F}_p$ and $\mathbb{F}_p^*$ are Abelian, they are solvable by Proposition 8.1.5, so that
$AGL(1,\mathbb{F}_p)$ is solvable by Theorem 8.1.4. We also know that $AGL(1,\mathbb{F}_p)$ is
non-Abelian for $p \ge 3$ by part (a) of Exercise 10 of Section 6.4. This example will be
important in Chapter 14. ◁
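The structure used in Example 8.1.6 can be seen in a small computation. The sketch below represents the affine map $x \mapsto ax + b$ on $\mathbb{F}_p$ by the pair `(a, b)`, takes the translations as the subgroup `T`, and checks that `T` is normal, that the quotient has order $p - 1$, and that the group is non-Abelian. The choice `p = 5` and the pair encoding are assumptions made for illustration only.

```python
p = 5   # arbitrary prime for the illustration

# The pair (a, b) stands for the map x -> a*x + b on F_p, with a nonzero.
AGL = [(a, b) for a in range(1, p) for b in range(p)]
T = [(1, b) for b in range(p)]          # translations x -> x + b

def compose(g, h):
    """Composition (g∘h)(x) = g(h(x)) of affine maps."""
    (a, b), (c, d) = g, h
    return ((a * c) % p, (a * d + b) % p)

def inverse(g):
    """Inverse affine map, using a^(p-2) as the inverse of a modulo p."""
    a, b = g
    a_inv = pow(a, p - 2, p)
    return (a_inv, (-a_inv * b) % p)

t_is_normal = all(compose(compose(g, t), inverse(g)) in T for g in AGL for t in T)
quotient_order = len(AGL) // len(T)
non_abelian = any(compose(g, h) != compose(h, g) for g in AGL for h in AGL)
print(len(AGL), t_is_normal, quotient_order, non_abelian)
```

Conjugating a translation by $x \mapsto ax + b$ gives another translation because the linear parts cancel, which is the computation that `t_is_normal` checks exhaustively.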


Mathematical Notes 


The definition of solvability is related to the ideas of simple groups, composition
series, and the Jordan-Hölder Theorem. We will say more about these topics in
Section 8.4. However, some standard results used to study solvable groups need to
be mentioned here.


■ Solvability and the Order of a Group. In some cases the solvability of a group is
determined by its order. For example, in Exercise 5 you will prove the following.


Theorem 8.1.7 If $p$ is prime, then every group of order $p^n$, $n \ge 0$, is solvable. ∎

In 1904, Burnside [4] generalized Theorem 8.1.7 as follows.


Theorem 8.1.8 If $p$ and $q$ are distinct primes, then every group of order $p^n q^m$,
$n, m \ge 0$, is solvable. ∎

In 1963, Feit and Thompson [5] proved the following surprising result.

Theorem 8.1.9 Every group of odd order is solvable. ∎


In spite of its simple statement, the proof of Theorem 8.1.9 uses some very
sophisticated mathematics and takes 255 pages.


■ Solvability and the Sylow Theorems. The Sylow Theorems imply some nice
results about solvability. As stated in Theorem A.5.1, we have:

• (First Sylow Theorem) If $p^n$ is the highest power of a prime dividing the order of
a finite group $G$, then $G$ has a subgroup of order $p^n$, called a $p$-Sylow subgroup.

• (Second Sylow Theorem) All $p$-Sylow subgroups of $G$ are conjugate in $G$.

• (Third Sylow Theorem) If $G$ has $N$ $p$-Sylow subgroups, then $N \equiv 1 \bmod p$ and $N$
divides $|G|$.

Here are two examples of how the Third Sylow Theorem can be used to prove that a
given group is solvable.


Example 8.1.10 Let $G$ be a group of order 14, and let $N$ be the number of 7-Sylow
subgroups of $G$. Then $N \equiv 1 \bmod 7$ and $N \mid 14$ by the Third Sylow Theorem. It
follows easily that $N = 1$, so that $G$ has a unique 7-Sylow subgroup $H$. Since any
conjugate of $H$ is also a 7-Sylow subgroup, $H$ coincides with its conjugates. Thus $H$
is normal, and then Theorem 8.1.4 easily implies that $G$ is solvable. ◁


Example 8.1.11 Let $G$ have order 42. Arguing as in Example 8.1.10, one sees that
$G$ has a normal 7-Sylow subgroup $H$. Then $G/H$ has order 6, so that $G/H \simeq \mathbb{Z}/6\mathbb{Z}$
or $S_3$, both of which are solvable. Hence $G$ is solvable by Theorem 8.1.4. ◁
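The counting step in Examples 8.1.10 and 8.1.11 amounts to listing the integers $N$ that divide $|G|$ and satisfy $N \equiv 1 \bmod p$. A minimal sketch of that enumeration follows; the function name `sylow_count_candidates` is a hypothetical helper, and it uses exactly the two conditions stated in the Third Sylow Theorem above.

```python
def sylow_count_candidates(order, p):
    """Values of N allowed by the Third Sylow Theorem: N | order and N = 1 mod p."""
    return [N for N in range(1, order + 1) if order % N == 0 and N % p == 1]

print(sylow_count_candidates(14, 7))   # Example 8.1.10
print(sylow_count_candidates(42, 7))   # Example 8.1.11
print(sylow_count_candidates(30, 3))   # order 30, 3-Sylow subgroups
print(sylow_count_candidates(30, 5))   # order 30, 5-Sylow subgroups
```

When the list contains only 1, the $p$-Sylow subgroup is unique and hence normal, which is the situation exploited in both examples.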


In Exercises 6 and 7 you will combine similar arguments with Burnside’s Theorem 
(Theorem 8.1.8) to show that all groups of order < 60 are solvable. 


Exercises for Section 8.1 


Exercise 1. Consider the groups $A_4$ and $S_4$.
(a) Show that $\{e, (12)(34), (13)(24), (14)(23)\}$ is a normal subgroup of $S_4$.
(b) Show that $A_4$ and $S_4$ are solvable.


Exercise 2. This exercise is concerned with the first part of the proof of Theorem 8.1.4.
(a) Prove assertions (a)-(d) made in the proof of the theorem.
(b) Suppose that $\varphi : M_1 \to M_2$ is an onto group homomorphism. If $|M_1| = p$, where $p$ is
prime, then prove that $|M_2| = 1$ or $p$.
(c) Explain how part (b) proves the assertion made in the text that $\overline{G}_{i-1}/\overline{G}_i$ either is trivial or
has prime order.


Exercise 3. Consider the map $\pi : G \to G/H$ used in the proof of Theorem 8.1.4. Given a
subgroup $K \subset G/H$, define $\pi^{-1}(K)$ as in (8.1).
(a) Show that $\pi^{-1}(K)$ is a subgroup of $G$ containing $H$.
(b) Show that $H$ is the kernel of $\pi$ and that $H = \pi^{-1}(\{eH\})$.
(c) Show that $G = \pi^{-1}(G/H)$.


Exercise 4. In the situation of (8.2), prove that $G_i$ is normal in $G_{i-1}$ and that $gG_i \mapsto \pi(g)\overline{G}_i$
gives the isomorphism (8.2).


Exercise 5. In this exercise, you will prove Theorem 8.1.7. We begin with a classic result
from group theory about the center of a group of prime power order. Recall that the center of
a group $G$ is the subset
$$Z(G) = \{g \in G \mid gh = hg \text{ for all } h \in G\}.$$
Most courses in abstract algebra prove that $Z(G) \ne \{e\}$ when $|G| = p^n$, $p$ prime (see, for
example, [Herstein, Thm. 2.11.2]). You may assume this result.
(a) In any group, show that $\langle g \rangle$ is normal for all $g \in Z(G)$.
(b) Prove Theorem 8.1.7 using induction on $n$, where $|G| = p^n$ and $p$ is prime.


Exercise 6. In this exercise you will prove that groups of order 30 are solvable. 
(a) Use the method of Example 8.1.10 to prove that groups of order 10 or 15 are solvable. 
(b) Show that a group of order 30 is solvable if and only if it has a proper normal subgroup 
different from {e}. 
(c) Let G be a group of order 30. Use the Third Sylow Theorem to show that G has one or 
ten 3-Sylow subgroups and one or six 5-Sylow subgroups. 
(d) Show that the group G of part (c) can’t simultaneously have ten 3-Sylow subgroups and 
six 5-Sylow subgroups. Conclude that G must be solvable. 
See [Herstein, Sec. 2.12, Ex. 7] for further details on the structure of groups of order 30. 


Exercise 7. Use Burnside's $p^n q^m$ Theorem (Theorem 8.1.8) to show that groups of order $< 60$
are solvable, with the possible exception of groups of order 30 or 42. When combined with
the previous exercise and Example 8.1.11, this implies that groups of order $< 60$ are solvable.
In Section 8.4 we will prove that $A_5$ is not solvable. Since $A_5$ has order 60, it is the smallest
nonsolvable group. (One can show that $A_5$ is the only nonsolvable group of order 60 up to
isomorphism.)


Exercise 8. Let $G$ be a finite group, and suppose that we have subgroups
$$\{e\} = G_n \subset \cdots \subset G_0 = G$$
such that $G_i$ is normal in $G_{i-1}$ for $i = 1,\dots,n$.
(a) Prove that $G$ is solvable if $G_{i-1}/G_i$ is Abelian for $i = 1,\dots,n$.
(b) Prove that $G$ is solvable if $G_{i-1}/G_i$ is solvable for $i = 1,\dots,n$.


8.2 RADICAL AND SOLVABLE EXTENSIONS 


The purpose of this section is to introduce the field theory needed to study solvability 
by radicals. 


A. Definitions and Examples. The naive idea of solvability by radicals arises 
from polynomials such as x° + 3x + 1, whose unique real root is 


V1) 


by Example 1.1.1. This algebraic number is built by taking successive radicals. 
When we cast this in terms of fields, we are led to the following definition. 
Definition 8.2.1 A field extension $F \subset L$ is radical if there are fields 

$F = F_0 \subset F_1 \subset \cdots \subset F_{n-1} \subset F_n = L$ 

where for $i = 1,\dots,n$, there is $\gamma_i \in F_i$ with $F_i = F_{i-1}(\gamma_i)$, $\gamma_i^{m_i} \in F_{i-1}$, $m_i > 0$. 


Notice that if we let $b_i = \gamma_i^{m_i} \in F_{i-1}$, then $\gamma_i$ is an $m_i$th root of $b_i$. This allows us 
to write $\gamma_i = \sqrt[m_i]{b_i}$, so that 

$F_i = F_{i-1}(\sqrt[m_i]{b_i})$, $b_i \in F_{i-1}$. 

This shows that radical extensions are obtained by adjoining successive radicals. 
Here is our first example of a radical extension. 


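Example 8.2.2 For the field extension $\mathbb{Q} \subset \mathbb{Q}(\sqrt{2+\sqrt{2}})$, let $\gamma_1 = \sqrt{2}$ and $\gamma_2 = \sqrt{2+\sqrt{2}}$. 
Then we have the extensions 

$\mathbb{Q} \subset \mathbb{Q}(\gamma_1) = \mathbb{Q}(\sqrt{2}) \subset \mathbb{Q}(\sqrt{2})(\gamma_2) = \mathbb{Q}(\sqrt{2})(\sqrt{2+\sqrt{2}})$, 

where $\gamma_1^2 = 2 \in \mathbb{Q}$ and $\gamma_2^2 = 2 + \sqrt{2} \in \mathbb{Q}(\sqrt{2})$. Since 

$\mathbb{Q}(\sqrt{2})(\sqrt{2+\sqrt{2}}) = \mathbb{Q}(\sqrt{2+\sqrt{2}})$ 

(be sure you can prove this), $\mathbb{Q} \subset \mathbb{Q}(\sqrt{2+\sqrt{2}})$ is a radical extension. ◁▷ 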


An important observation is that some extensions are not radical but are contained 
in larger radical extensions. Here is an example. 


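Example 8.2.3 Let $\mathbb{Q} \subset L$ be a splitting field of $f = x^3 + x^2 - 2x - 1 \in \mathbb{Q}[x]$. In 
Example 7.4.3, we showed that $f$ is irreducible over $\mathbb{Q}$ with discriminant 

$\Delta(f) = 49 = 7^2 > 0$. 

By Theorem 1.3.1, the roots of $f$ are real, which allows us to assume that $L \subset \mathbb{R}$. 
Furthermore, since $\Delta(f)$ is a perfect square, Proposition 7.4.2 implies that $\mathbb{Q} \subset L$ is 
a Galois extension of degree 3. Cardan’s formulas imply that $\mathbb{Q} \subset L$ is contained in 
a radical extension (see also Exercise 1). 

However, the extension $\mathbb{Q} \subset L$ is not radical itself. We prove this as follows. If 
$\mathbb{Q} \subset L$ were radical, then $[L : \mathbb{Q}] = 3$ would imply that $L = \mathbb{Q}(\gamma)$, where $\gamma^m \in \mathbb{Q}$ for 
some $m \ge 3$ (see Exercise 2 for the details). Then the minimal polynomial of $\gamma$ 
over $\mathbb{Q}$ would divide $x^m - \gamma^m$ and have degree $[L : \mathbb{Q}] = 3$. Since $\mathbb{Q} \subset L$ is Galois, this 
minimal polynomial would split completely over $\mathbb{Q}(\gamma)$, so that three of 
$\gamma, \zeta_m\gamma, \zeta_m^2\gamma, \dots, \zeta_m^{m-1}\gamma$ (where $\zeta_m = e^{2\pi i/m}$) would lie in $L$. This is impossible, 
since $L \subset \mathbb{R}$ and at most two of these numbers are real. Hence $\mathbb{Q} \subset L$ is not radical. ◁▷ 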
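As a numerical sanity check (not a proof), one can confirm that the three real numbers $2\cos(2j\pi/7)$, $j = 1,2,3$, from Exercise 1 are roots of $f$ and that the squared product of their differences is 49. The sketch below uses floating-point arithmetic only; the function names are illustrative and not from the text.

```python
import math


def f(x):
    """The cubic of Example 8.2.3."""
    return x**3 + x**2 - 2*x - 1


def discriminant(roots):
    """Product of squared differences of the roots (monic polynomial)."""
    prod = 1.0
    for i in range(len(roots)):
        for j in range(i + 1, len(roots)):
            prod *= roots[i] - roots[j]
    return prod ** 2


roots = [2 * math.cos(2 * math.pi * j / 7) for j in (1, 2, 3)]
residuals = [abs(f(r)) for r in roots]
disc = discriminant(roots)

print(residuals)  # all tiny, limited only by rounding error
print(disc)       # numerically close to 49
```

All three roots are real, which is consistent with the claim that $L$ can be taken inside $\mathbb{R}$.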


This example motivates the following definition. 


Definition 8.2.4 A field extension $F \subset L$ is solvable (sometimes called solvable by 
radicals) if there is a field extension $L \subset M$ such that $F \subset M$ is radical. 




In this terminology, the extension $\mathbb{Q} \subset L$ considered in Example 8.2.3 is solvable, 
since it is contained in a radical extension. 


B. Compositums and Galois Closures. In order to understand radical and 
solvable extensions, we need to define the compositum of two or more subfields. 


Definition 8.2.5 Suppose that we have a field $L$ and two subfields $K_1 \subset L$ and $K_2 \subset L$. 
Then the compositum of $K_1$ and $K_2$ in $L$ is the smallest subfield of $L$ containing $K_1$ 
and $K_2$. We denote the compositum by $K_1 K_2$. 


In Exercise 3 you will show that the compositum always exists and that the 
compositum of $K_1 = F(\alpha_1,\dots,\alpha_n) \subset L$ and $K_2 = F(\beta_1,\dots,\beta_m) \subset L$ is 

(8.3) $K_1 K_2 = F(\alpha_1,\dots,\alpha_n,\beta_1,\dots,\beta_m)$. 

For example, the compositum of $\mathbb{Q}(\sqrt{2})$ and $\mathbb{Q}(\sqrt{3})$ in $\mathbb{R}$ is $\mathbb{Q}(\sqrt{2},\sqrt{3})$. 

We next consider Galois closures. Proposition 7.1.7 proves that every finite 
separable extension $F \subset L$ has a Galois closure, which may be thought of as the 
smallest Galois extension of $F$ containing $L$. We can express the Galois closure as a 
compositum as follows. 


Proposition 8.2.6 Suppose that $F \subset L \subset M$ where $F \subset M$ is Galois. Then the 
compositum of all conjugate fields of $L$ in $M$ is the Galois closure of $F \subset L$. 


Proof: The Theorem of the Primitive Element implies that $L = F(\alpha)$ for some 
$\alpha \in L$. Since $F \subset M$ is Galois, the minimal polynomial $h$ of $\alpha$ over $F$ is separable and 
splits completely over $M$, say $h(x) = (x - \alpha_1)\cdots(x - \alpha_r)$, where $\alpha_1 = \alpha$. It follows 
that 

$K = F(\alpha_1,\dots,\alpha_r)$ 

is a Galois extension of $F$ containing $L$. In Exercise 4 you will prove that $F \subset K$ 
is the Galois closure of $F \subset L$. In Exercise 4 you will also show that the conjugate 
fields of $L$ in $M$ are $F(\alpha_i)$ for $i = 1,\dots,r$. Then (8.3) implies that 

$F(\alpha_1)\cdots F(\alpha_r) = F(\alpha_1,\dots,\alpha_r) = K$. 

This proves that $K$ is the compositum of the conjugate fields of $L$ in $M$. ∎ 


C. Properties of Radical and Solvable Extensions. We begin with the 
following useful lemma. 


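Lemma 8.2.7 
(a) If $F \subset L$ and $L \subset M$ are radical, then so is $F \subset M$. 

(b) If we have $F \subset K_1 \subset L$ and $F \subset K_2 \subset L$ such that $F \subset K_1$ is radical, then 
$K_2 \subset K_1 K_2$ is radical. 

(c) If we have $F \subset K_1 \subset L$ and $F \subset K_2 \subset L$ such that $F \subset K_1$ and $F \subset K_2$ are radical, 
then $F \subset K_1 K_2$ is radical. 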
Proof: Part (a) follows easily from the definition of radical extension by combining 
the sequences of fields used for $F \subset L$ and $L \subset M$. We omit the details. 

For part (b), we have fields $F = F_0 \subset F_1 \subset \cdots \subset F_{n-1} \subset F_n = K_1$ such that 
$F_i = F_{i-1}(\gamma_i)$, where $\gamma_i^{m_i} \in F_{i-1}$ for $1 \le i \le n$. Then define fields 

(8.4) 
$F'_0 = K_2$, 
$F'_1 = F'_0(\gamma_1)$, 
$\vdots$ 
$F'_n = F'_{n-1}(\gamma_n)$. 

In Exercise 5 you will use $F \subset K_2$ and induction to show that 

(8.5) $F_i \subset F'_i$ 

for $i = 0,\dots,n$. This in turn implies that 

$\gamma_i^{m_i} \in F_{i-1} \subset F'_{i-1}$ 

for $i = 1,\dots,n$. It follows easily that $K_2 = F'_0 \subset \cdots \subset F'_n$ is radical. In Exercise 5 you 
will show that $F'_n$ is the compositum $K_1 K_2$, which will prove part (b). 

Finally, for part (c), note that $K_2 \subset K_1 K_2$ is radical by part (b). Then we are done 
by part (a), since $F \subset K_2$ is radical by assumption. ∎ 


We next use Proposition 8.2.6 to study the Galois closure of a radical extension. 


Theorem 8.2.8 If an extension $F \subset L$ is separable and radical, then its Galois closure 
is also radical. 


Proof: Find an extension $L \subset M$ such that $F \subset M$ is Galois (such an extension exists 
by the existence of Galois closures). Given $\sigma \in \mathrm{Gal}(M/F)$, we get the conjugate 
field $F \subset \sigma L \subset M$. Exercise 6 shows that $F \subset \sigma L$ is radical because $F \subset L$ is radical. 

But once we know that each conjugate field is radical over $F$, Lemma 8.2.7 tells us 
that their compositum is also radical over $F$. Then we are done, since the compositum 
is the Galois closure by Proposition 8.2.6. ∎ 


The following corollary of Theorem 8.2.8 will be used in Section 8.5. 


Corollary 8.2.9 If a finite extension $F \subset L$ of characteristic 0 is solvable, then so is 
its Galois closure. 


Proof: Since $F \subset L$ is solvable, we have $F \subset L \subset L'$ such that $F \subset L'$ is radical. 
Furthermore, $F \subset L'$ is separable (we are in characteristic 0) and hence has a Galois 
closure $F \subset L' \subset M$. Then $F \subset M$ is radical by Theorem 8.2.8. 

Now consider $F \subset L \subset M$. Since $F \subset M$ is Galois, it contains the Galois closure 
of $F \subset L$ by Proposition 8.2.6. Thus the Galois closure lies in the radical extension 
$F \subset M$, so that the Galois closure is solvable by definition. ∎ 


In the next section, we will see how the solvable extensions defined here relate to 
the solvable groups studied in Section 8.1. 
Historical Notes 


In 1824 Abel proved that the general quintic cannot be solved by radicals. He 
presented his proof in the privately printed Memoir on algebraic equations, in which 
is demonstrated the impossibility of solving the general equation of the fifth degree 
[Abel, pp. 28-33] that he sent to the leading mathematicians of Europe. In this 
memoir, Abel begins his proof as follows: 

Let 

$y^5 - ay^4 + by^3 - cy^2 + dy - e = 0$ 

be the general equation of the fifth degree and let us suppose that it is solvable 
algebraically, that is, one can express y by a function formed by radicals of the 
quantities a, b, c, d, and e. 
It is clear that in this case we can express y in the form 

$y = p + p_1 R^{1/m} + p_2 R^{2/m} + \cdots + p_{m-1} R^{(m-1)/m}$, 

m being a prime number and $R, p, p_1, p_2$, etc. functions of the same form as y, 
and so on until we come to rational functions of the quantities a, b, c, d, and e. 


(This is from the English translation in [9, pp. 155-169].) Abel’s description of y 
in terms of radicals is a “top-down” version of the definition of radical extension. 
Definition 8.2.1 is a “bottom-up” approach that begins with the smallest field (here 
containing the coefficients a, b, c, d, e) and successively adds radicals. Abel instead 
begins with the largest field (here containing y) and successively strips away radicals. 
Another difference is that Abel focuses on individual elements rather than the fields 
in which they lie. Nevertheless, the above quotation contains a clear description of a 
radical extension. Be sure you understand this. 

The reader may wonder why Abel assumes that $m$ is prime in $R^{1/m}$. We will see in 
Lemma 8.6.2 that this is no restriction. Also note that Abel’s “solvable algebraically” 
means “solvable by radicals” in modern terms. We will say more about Abel’s proof 
in Sections 8.5 and 12.1. 


Exercises for Section 8.2 


Exercise 1. As in Example 8.2.3, let $L$ be a splitting field of $x^3 + x^2 - 2x - 1$ over $\mathbb{Q}$. Also let 
$\zeta_7 = e^{2\pi i/7}$. 

(a) Show that the roots of $x^3 + x^2 - 2x - 1$ are $2\cos(2j\pi/7) = \zeta_7^j + \zeta_7^{-j}$ for $j = 1,2,3$. 

(b) Show that $\mathbb{Q} \subset L \subset \mathbb{Q}(\zeta_7)$, and explain why $\mathbb{Q} \subset \mathbb{Q}(\zeta_7)$ is radical. 


Exercise 2. In the situation of Example 8.2.3, assume that $\mathbb{Q} \subset L$ is radical. Prove that 
$L = \mathbb{Q}(\gamma)$ where $\gamma^m \in \mathbb{Q}$ for some $m \ge 3$. 


Exercise 3. Here you will prove two properties of compositums. 
(a) Prove that the compositum $K_1 K_2$ exists. 
(b) Prove (8.3). 


Exercise 4. This exercise is concerned with the proof of Proposition 8.2.6. 
(a) Show that $K = F(\alpha_1,\dots,\alpha_r)$ is the Galois closure of $F \subset L$. 
(b) Prove that the conjugates of $L$ in $M$ are the fields $F(\alpha_i)$ for $i = 1,\dots,r$. 




Exercise 5. This exercise will complete the proof of part (b) of Lemma 8.2.7. 
(a) Prove (8.5). 
(b) Prove that the field $F'_n$ defined in (8.4) is the compositum $K_1 K_2$. 


Exercise 6. Suppose we have finite extensions $F \subset L \subset M$ and $\sigma \in \mathrm{Gal}(M/F)$, and assume 
that $F \subset L$ is radical. Prove that $F \subset \sigma L$ is also radical. 


Exercise 7. Suppose that we have extensions $F \subset K_1 \subset L$ and $F \subset K_2 \subset L$ such that $F \subset K_1$ 
and $F \subset K_2$ are Galois. Prove that $F \subset K_1 K_2$ is Galois. This will show that the compositum 
of two Galois extensions is again Galois. 


8.3 SOLVABLE EXTENSIONS AND SOLVABLE GROUPS 


The main question we will answer in this section is: When is a finite extension $F \subset L$ 
solvable? Because of subtleties that can occur in characteristic p, we will make the 
following simplifying assumption: 


All fields appearing in this section will have characteristic 0. 
See Section 8.5 for an example to show what can go wrong in characteristic p. 


A. Roots of Unity and Lagrange Resolvents. Section A.2 shows that given 
one mth root of a complex number, we get the others by multiplying by the mth roots 
of unity. Since radical extensions involve taking mth roots, it makes sense that roots 
of unity will play an important role. However, the roots of unity in Section A.2 are 
complex numbers, while the fields considered here need not be subfields of C (even 
though they have characteristic 0). For this reason, we need to study roots of unity 
for arbitrary fields of characteristic 0. 

Given a positive integer $m$ and a field $L$ of characteristic 0, consider the splitting 
field of $x^m - 1$ over $L$. In Exercise 1 you will show that $x^m - 1$ has $m$ distinct roots in 
its splitting field. These roots form a group under multiplication, which is cyclic by 
Proposition A.5.3. A generator $\zeta$ of this group has the following two properties: 

• The $m$ distinct roots of $x^m - 1$ are $1, \zeta, \dots, \zeta^{m-1}$. 
• The splitting field of $x^m - 1$ over $L$ is $L(1, \zeta, \dots, \zeta^{m-1}) = L(\zeta)$. 

We call $\zeta$ a primitive $m$th root of unity in this situation. We claim that 

(8.6) $L \subset L(\zeta)$ is Galois and $\mathrm{Gal}(L(\zeta)/L)$ is Abelian. 

To prove this, note that $L \subset L(\zeta)$ is Galois, since $L(\zeta)$ is the splitting field of the 
separable polynomial $x^m - 1 \in L[x]$. Now suppose that $\sigma, \tau \in \mathrm{Gal}(L(\zeta)/L)$. Then $\sigma, \tau$ 
are determined by their values on $\zeta$, and since the roots of $x^m - 1$ are $1, \zeta, \dots, \zeta^{m-1}$, 
it follows that $\sigma(\zeta) = \zeta^i$ and $\tau(\zeta) = \zeta^j$ for integers $i, j$. Thus 

$\sigma\tau(\zeta) = \sigma(\zeta^j) = (\sigma(\zeta))^j = (\zeta^i)^j = \zeta^{ij}$. 

A similar computation shows that $\tau\sigma(\zeta) = \zeta^{ji} = \zeta^{ij}$. Then $\sigma\tau = \tau\sigma$, since $\sigma\tau$ and 
$\tau\sigma$ are uniquely determined by their values on $\zeta$. Hence $\mathrm{Gal}(L(\zeta)/L)$ is Abelian. 
Given a Galois extension $F \subset L$ and a primitive $m$th root of unity $\zeta$, we get the 
extensions 

$F \subset L \subset L(\zeta)$ and $F \subset F(\zeta) \subset L(\zeta)$. 

We can relate the solvability of the various Galois groups as follows. 


Lemma 8.3.1 Let $F \subset L$ be a Galois extension, and $\zeta$ be a primitive $m$th root of 
unity. Then $F \subset L(\zeta)$ and $F(\zeta) \subset L(\zeta)$ are also Galois, and 

$\mathrm{Gal}(L/F)$ is solvable $\iff$ $\mathrm{Gal}(L(\zeta)/F)$ is solvable 
$\iff$ $\mathrm{Gal}(L(\zeta)/F(\zeta))$ is solvable. 


Proof: In Exercise 2 you will prove that $F \subset L(\zeta)$ is Galois, which implies that 
$F(\zeta) \subset L(\zeta)$ is also Galois. To prove the first equivalence, we use the extensions 

$F \subset L \subset L(\zeta)$. 

Since $F \subset L(\zeta)$ and $F \subset L$ are Galois, Theorem 7.2.7 implies that $\mathrm{Gal}(L(\zeta)/L)$ is a 
normal subgroup of $\mathrm{Gal}(L(\zeta)/F)$ such that 

$\mathrm{Gal}(L/F) \simeq \mathrm{Gal}(L(\zeta)/F)/\mathrm{Gal}(L(\zeta)/L)$. 

But $\mathrm{Gal}(L(\zeta)/L)$ is Abelian by (8.6) and hence solvable by Proposition 8.1.5. Then 
Theorem 8.1.4 implies that $\mathrm{Gal}(L(\zeta)/F)$ is solvable if and only if $\mathrm{Gal}(L/F)$ is. This 
proves the first equivalence of the lemma. 

For the second equivalence, consider the extensions 

$F \subset F(\zeta) \subset L(\zeta)$. 

Here, $F \subset F(\zeta)$ is Galois by (8.6) (applied with $F$ in place of $L$), so that, arguing as 
above, we get a group isomorphism 

$\mathrm{Gal}(F(\zeta)/F) \simeq \mathrm{Gal}(L(\zeta)/F)/\mathrm{Gal}(L(\zeta)/F(\zeta))$. 

Also as above, $\mathrm{Gal}(F(\zeta)/F)$ is Abelian and hence solvable, and then Theorem 8.1.4 
implies that $\mathrm{Gal}(L(\zeta)/F)$ is solvable if and only if $\mathrm{Gal}(L(\zeta)/F(\zeta))$ is. This proves 
the second equivalence of the lemma. ∎ 


The following result will play a crucial role in our analysis of solvable extensions. 
The proof uses a clever construction due to Lagrange. 


Lemma 8.3.2 Suppose that $K \subset M$ is a Galois extension with $\mathrm{Gal}(M/K) \simeq \mathbb{Z}/p\mathbb{Z}$, 
$p$ prime. If $K$ contains a primitive $p$th root of unity $\zeta$, then there is $\alpha \in M$ such that 
$M = K(\alpha)$ and $\alpha^p \in K$. 
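Proof: By hypothesis, $\mathrm{Gal}(M/K)$ is cyclic of order $p$. Let $\sigma \in \mathrm{Gal}(M/K)$ be a 
generator, and fix $\beta \in M \setminus K$. Then, for each $i = 0,\dots,p-1$, consider the Lagrange 
resolvent defined by 

(8.7) $\alpha_i = \beta + \zeta^{-i}\sigma(\beta) + \zeta^{-2i}\sigma^2(\beta) + \cdots + \zeta^{-i(p-1)}\sigma^{p-1}(\beta)$. 

This easily implies that 

$\zeta^{-i}\sigma(\alpha_i) = \zeta^{-i}\sigma(\beta) + \zeta^{-2i}\sigma^2(\beta) + \cdots + \zeta^{-i(p-1)}\sigma^{p-1}(\beta) + \zeta^{-ip}\sigma^p(\beta)$. 

Since $\zeta^p = 1$ and $\sigma^p$ is the identity, the final term on the right-hand side of the above 
equation simplifies to $\beta$, and then the equation becomes 

$\zeta^{-i}\sigma(\alpha_i) = \alpha_i$, 

so that 

(8.8) $\sigma(\alpha_i) = \zeta^i\alpha_i$. 

Since $\zeta \in K$ and $\zeta^p = 1$, (8.8) easily implies that 

$\sigma(\alpha_i^p) = (\sigma(\alpha_i))^p = (\zeta^i\alpha_i)^p = \zeta^{ip}\alpha_i^p = \alpha_i^p$. 

But $\sigma$ generates $\mathrm{Gal}(M/K)$, so that the above equation shows that $\alpha_i^p$ is fixed by 
the Galois group. Hence $\alpha_i^p \in K$, since $K \subset M$ is Galois. Also, when $i = 0$, (8.8) 
becomes $\sigma(\alpha_0) = \alpha_0$, and then the argument just given shows that $\alpha_0 \in K$. 

Suppose for the moment there is some $i$ between 1 and $p-1$ such that $\alpha_i \ne 0$. For 
these $i$’s, we also have $\zeta^i \ne 1$, and it follows that $\zeta^i\alpha_i \ne \alpha_i$. Combining this with 
(8.8), we conclude that $\sigma(\alpha_i) \ne \alpha_i$, so that $\alpha_i \notin K$. This implies $M = K(\alpha_i)$, since 
$[M : K]$ is prime. Then $\alpha = \alpha_i$ has the desired properties, since $\alpha_i^p \in K$. 

It remains to consider what happens if $\alpha_i = 0$ for all $i = 1,\dots,p-1$. In this case, 
we add up the equations (8.7) for $i = 0,\dots,p-1$ to obtain 

$\alpha_0 = \alpha_0 + \alpha_1 + \cdots + \alpha_{p-1}$ 
$= (\beta + \sigma(\beta) + \sigma^2(\beta) + \cdots + \sigma^{p-1}(\beta))$ 
$+ (\beta + \zeta^{-1}\sigma(\beta) + \zeta^{-2}\sigma^2(\beta) + \cdots + \zeta^{-(p-1)}\sigma^{p-1}(\beta))$ 
$+ (\beta + \zeta^{-2}\sigma(\beta) + \zeta^{-4}\sigma^2(\beta) + \cdots + \zeta^{-2(p-1)}\sigma^{p-1}(\beta)) + \cdots$ 
$+ (\beta + \zeta^{-(p-1)}\sigma(\beta) + \zeta^{-2(p-1)}\sigma^2(\beta) + \cdots + \zeta^{-(p-1)(p-1)}\sigma^{p-1}(\beta))$ 
$= p\beta + (1 + \zeta^{-1} + \zeta^{-2} + \cdots + \zeta^{-(p-1)})\sigma(\beta)$ 
$+ (1 + \zeta^{-2} + \zeta^{-4} + \cdots + \zeta^{-2(p-1)})\sigma^2(\beta) + \cdots$ 
$+ (1 + \zeta^{-(p-1)} + \zeta^{-2(p-1)} + \cdots + \zeta^{-(p-1)(p-1)})\sigma^{p-1}(\beta)$. 

In Exercise 3 you will show that 

(8.9) $1 + \zeta^{-i} + \zeta^{-2i} + \cdots + \zeta^{-i(p-1)} = 0$ 

for $i = 1,\dots,p-1$. It follows that the above formula for $\alpha_0$ simplifies to 

$\alpha_0 = p\beta$, 

so that $\beta = \alpha_0/p$ (remember that we are in characteristic 0). However, we proved 
above that $\alpha_0 \in K$, yet $\beta \notin K$ by assumption. This contradiction shows that at least 
one of $\alpha_1,\dots,\alpha_{p-1}$ is nonzero, which completes the proof of the lemma. ∎ 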
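To see (8.7) and (8.8) in a concrete case, here is a small numerical sketch. It is an illustration under stated assumptions, not part of the proof: we take $p = 3$, $K = \mathbb{Q}(\omega)$ with $\omega = e^{2\pi i/3}$, and $M = K(t)$ with $t^3 = 2$, so that $\sigma(t) = \omega t$ generates $\mathrm{Gal}(M/K)$. An element $a + bt + ct^2$ of $M$ is stored as the triple `(a, b, c)` of complex floating-point numbers; all function names are hypothetical.

```python
import cmath

P = 3
OMEGA = cmath.exp(2j * cmath.pi / 3)  # primitive cube root of unity (zeta in the text)


def sigma(elem):
    """Generator of Gal(M/K): t -> omega*t on a + b*t + c*t^2."""
    a, b, c = elem
    return (a, OMEGA * b, OMEGA**2 * c)


def scale(s, elem):
    """Multiply an element of M by a scalar s in K."""
    return tuple(s * x for x in elem)


def add(u, v):
    return tuple(x + y for x, y in zip(u, v))


def mul(u, v):
    """Multiply in M = K[t]/(t^3 - 2)."""
    a0, a1, a2 = u
    b0, b1, b2 = v
    return (a0*b0 + 2*(a1*b2 + a2*b1),
            a0*b1 + a1*b0 + 2*a2*b2,
            a0*b2 + a1*b1 + a2*b0)


def resolvent(i, beta):
    """alpha_i = sum over j of zeta^(-i*j) * sigma^j(beta), as in (8.7)."""
    total = (0j, 0j, 0j)
    term = beta
    for j in range(P):
        total = add(total, scale(OMEGA ** (-i * j), term))
        term = sigma(term)
    return total


def close(u, v, tol=1e-9):
    return all(abs(x - y) < tol for x, y in zip(u, v))


beta = (1, 1, 1)  # beta = 1 + t + t^2, which is not in K
for i in range(P):
    alpha = resolvent(i, beta)
    cube = mul(mul(alpha, alpha), alpha)
    # (8.8): sigma(alpha_i) = zeta^i * alpha_i, and alpha_i^3 has no t or t^2 part
    print(i, close(sigma(alpha), scale(OMEGA**i, alpha)),
          abs(cube[1]) < 1e-9 and abs(cube[2]) < 1e-9)
```

For this choice of $\beta$ one finds $\alpha_0 = 3$, $\alpha_1 = 3t$, and $\alpha_2 = 3t^2$, so $\alpha = \alpha_1$ already satisfies $M = K(\alpha)$ and $\alpha^3 = 54 \in K$.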


B. Galois’s Theorem. In Section 8.2, we showed that if $F \subset L$ is solvable, then 
we can find an extension $L \subset M$ such that $F \subset M$ is Galois and solvable. For an 
arbitrary Galois extension, the wonderful fact is that the Galois group determines 
whether or not the extension is solvable. The following theorem due to Galois is one 
of the most important applications of Galois theory. 


Theorem 8.3.3 Let $F \subset L$ be a Galois extension. Then the following are equivalent: 
(a) $F \subset L$ is a solvable extension. 
(b) $\mathrm{Gal}(L/F)$ is a solvable group. 


Proof: We prove (a) $\Rightarrow$ (b) in three steps. 


Reduction to the Radical Case. Since $F \subset L$ is solvable, it lies in a radical extension 
$F \subset L'$. By Theorem 8.2.8, the Galois closure $F \subset M$ of $F \subset L'$ is radical over $F$. 
Thus we have $F \subset L \subset M$ where $M$ is radical and Galois over $F$. 

Suppose for the moment that $\mathrm{Gal}(M/F)$ is a solvable group. Since $F \subset L$ is 
Galois, Theorem 7.2.7 implies that we have an isomorphism 

$\mathrm{Gal}(L/F) \simeq \mathrm{Gal}(M/F)/\mathrm{Gal}(M/L)$. 

Then Theorem 8.1.4 implies that $\mathrm{Gal}(L/F)$ is also solvable. Hence it suffices to 
prove that $\mathrm{Gal}(M/F)$ is solvable. In other words, we can assume that $F \subset L$ is radical 
and Galois. 


Adjunction of Roots of Unity. Suppose that $F \subset L$ is radical and Galois. If we adjoin 
a primitive $m$th root of unity $\zeta$ to both $F$ and $L$, then part (b) of Lemma 8.2.7 implies 
that the resulting extension $F(\zeta) \subset L(\zeta)$ is radical, since $L(\zeta)$ is the compositum of 
$F(\zeta)$ and $L$. This extension is also Galois by Lemma 8.3.1. If we can show that 
$\mathrm{Gal}(L(\zeta)/F(\zeta))$ is solvable, then Lemma 8.3.1 will imply that $\mathrm{Gal}(L/F)$ is solvable. 
Hence we can assume without loss of generality that $F$ contains any $m$th root of unity 
we want. 


Proof of Solvability. Since $F \subset L$ is radical, we have subfields 

(8.10) $F = F_0 \subset F_1 \subset \cdots \subset F_{n-1} \subset F_n = L$ 

such that for $i = 1,\dots,n$, we have $F_i = F_{i-1}(\gamma_i)$, where $\gamma_i^{m_i} \in F_{i-1}$ for some $m_i > 0$. 
By the previous step, we can also assume that $F$ contains a primitive $m_i$th root of 
unity $\zeta_i$ for $i = 1,\dots,n$. In this situation, we claim that 

(8.11) $F_{i-1} \subset F_i$ is Galois with cyclic Galois group. 

To prove this, note that $1, \zeta_i, \dots, \zeta_i^{m_i-1}$ are the distinct $m_i$th roots of unity, which 
means that 

$\gamma_i, \zeta_i\gamma_i, \dots, \zeta_i^{m_i-1}\gamma_i$ 

are the distinct roots of $x^{m_i} - \gamma_i^{m_i} \in F_{i-1}[x]$. Since $\zeta_i \in F \subset F_{i-1}$, we have 

$F_{i-1}(\gamma_i, \zeta_i\gamma_i, \dots, \zeta_i^{m_i-1}\gamma_i) = F_{i-1}(\gamma_i)$. 

This shows that $F_{i-1} \subset F_i = F_{i-1}(\gamma_i)$ is Galois. The proof that the Galois group is 
cyclic is fun and is left to the reader as Exercise 4. This completes the proof of (8.11). 

We now prove solvability. Given the subfields (8.10), consider the subgroups 

$G_i = \mathrm{Gal}(L/F_i) \subset \mathrm{Gal}(L/F)$. 

Since the Galois correspondence is inclusion-reversing, (8.10) gives 

$\{1_L\} = \mathrm{Gal}(L/L) = \mathrm{Gal}(L/F_n) = G_n \subset G_{n-1} \subset \cdots$ 
$\subset G_1 \subset G_0 = \mathrm{Gal}(L/F_0) = \mathrm{Gal}(L/F)$. 

Consider the extensions $F_{i-1} \subset F_i \subset L$. Then $F_{i-1} \subset L$ is Galois, since $F_{i-1}$ is 
an intermediate field of the Galois extension $F \subset L$. Furthermore, $F_{i-1} \subset F_i$ is also 
Galois by (8.11). Hence Theorem 7.2.7 implies that $G_i$ is normal in $G_{i-1}$ with 

$G_{i-1}/G_i = \mathrm{Gal}(L/F_{i-1})/\mathrm{Gal}(L/F_i) \simeq \mathrm{Gal}(F_i/F_{i-1})$. 

By (8.11), we conclude that $G_{i-1}/G_i$ is Abelian. Since this is true for all $i = 1,\dots,n$, 
part (a) of Exercise 8 from Section 8.1 implies that $\mathrm{Gal}(L/F)$ is solvable. This 
completes the proof of (a) $\Rightarrow$ (b). 

It remains to prove (b) $\Rightarrow$ (a). We do this in two steps. 


A Special Case. Let $F \subset L$ be Galois with solvable Galois group. Assume in addition 
that $F$ satisfies the following special hypothesis: 

(8.12) $F$ has a primitive $p$th root of unity for every prime $p$ dividing $|\mathrm{Gal}(L/F)|$. 

We will prove that $F \subset L$ is radical in this situation. Since $\mathrm{Gal}(L/F)$ is solvable, 
we have subgroups $\{1_L\} = G_n \subset \cdots \subset G_0 = \mathrm{Gal}(L/F)$ as in Definition 8.1.1. Then 
consider the fixed fields 

$F_i = L_{G_i} \subset L$. 

Since the Galois correspondence is inclusion-reversing, this gives the fields 

$F = L_{\mathrm{Gal}(L/F)} = L_{G_0} = F_0 \subset F_1 \subset \cdots$ 
$\subset F_{n-1} \subset F_n = L_{G_n} = L_{\{1_L\}} = L$. 

Furthermore, since $G_i$ is normal in $G_{i-1}$, the Galois correspondence together with 
Theorem 7.2.7 implies that 

$G_{i-1}/G_i \simeq \mathrm{Gal}(F_i/F_{i-1})$. 

Since $[G_{i-1} : G_i]$ is prime, $\mathrm{Gal}(F_i/F_{i-1}) \simeq \mathbb{Z}/p\mathbb{Z}$ for a prime $p$. In Exercise 5 you 
will prove that $p$ divides $|\mathrm{Gal}(L/F)|$. By (8.12), $F$ and hence $F_{i-1}$ contain a primitive 
$p$th root of unity. 

It follows that $F_{i-1} \subset F_i$ satisfies the conditions of Lemma 8.3.2. Thus $F_i$ is 
obtained from $F_{i-1}$ by adjunction of a $p$th root of an element of $F_{i-1}$. This proves 
that $F \subset L$ is a radical extension when $F$ satisfies (8.12). 


The General Case. Finally, we consider what happens when we only assume 
that $F \subset L$ is a Galois extension with solvable Galois group. In this situation, 
let $\zeta$ be a primitive $m$th root of unity, where $m = |\mathrm{Gal}(L/F)|$. By Lemma 8.3.1, 
$\mathrm{Gal}(L(\zeta)/F(\zeta))$ is solvable since $\mathrm{Gal}(L/F)$ is. 

We relate the orders of these groups as follows. As in the proof of Lemma 8.3.1, 
we have an isomorphism 

$\mathrm{Gal}(L/F) \simeq \mathrm{Gal}(L(\zeta)/F)/\mathrm{Gal}(L(\zeta)/L)$. 

If you look back at the proof of Theorem 7.2.7, you will see that this isomorphism 
comes from the homomorphism 

$\mathrm{Gal}(L(\zeta)/F) \to \mathrm{Gal}(L/F)$ 

given by restricting an automorphism of $L(\zeta)$ to $L$. Since $\mathrm{Gal}(L(\zeta)/F(\zeta))$ is a 
subgroup of $\mathrm{Gal}(L(\zeta)/F)$, we have a homomorphism 

(8.13) $\mathrm{Gal}(L(\zeta)/F(\zeta)) \to \mathrm{Gal}(L/F)$ 

also given by restriction to $L$. But the kernel of this map is the identity, since elements 
of the kernel are the identity on both $L$ and $F(\zeta)$. Thus (8.13) is one-to-one, which 
by Lagrange’s Theorem implies that 

(8.14) $m = |\mathrm{Gal}(L/F)|$ is a multiple of $|\mathrm{Gal}(L(\zeta)/F(\zeta))|$. 

Now let $p$ be a prime dividing $|\mathrm{Gal}(L(\zeta)/F(\zeta))|$. Then $p$ divides $m$ by (8.14). 
Since $\zeta$ is a primitive $m$th root of unity, $\zeta^{m/p}$ is a primitive $p$th root of unity (see 
Exercise 6). Since $\zeta^{m/p} \in F(\zeta)$, we conclude that $F(\zeta) \subset L(\zeta)$ satisfies (8.12) with 
$F$ and $L$ replaced by $F(\zeta)$ and $L(\zeta)$, respectively. It follows that $F(\zeta) \subset L(\zeta)$ is 
radical by the Special Case. But $F \subset F(\zeta)$ is obviously radical ($\zeta^m = 1 \in F$), so that 
$F \subset L(\zeta)$ is radical by part (a) of Lemma 8.2.7. 

Since $F \subset L(\zeta)$ is radical, the obvious inclusion $L \subset L(\zeta)$ implies that $L$ lies in a 
radical extension of $F$. Hence $F \subset L$ is solvable by definition. This completes the 
proof of the theorem. ∎ 


The proof of Theorem 8.3.3 implies that a solvable Galois extension becomes 
radical after adjoining a suitable root of unity. Here is the precise result. 


Corollary 8.3.4 Let $F \subset L$ be Galois and solvable, and let $\zeta$ be a primitive $m$th root 
of unity, where $m = [L : F]$. Then $F \subset L(\zeta)$ is radical. 

Proof: If $F \subset L$ is Galois and solvable, then $\mathrm{Gal}(L/F)$ is solvable. The General 
Case of the proof of (b) $\Rightarrow$ (a) in Theorem 8.3.3 shows that $F \subset L(\zeta)$ is radical, 
where $\zeta$ is a primitive $m$th root of unity for $m = |\mathrm{Gal}(L/F)|$. Then we are done, since 
$|\mathrm{Gal}(L/F)| = [L : F]$ for Galois extensions. ∎ 


Exercise 7 will give a more refined version of this result. 


C. Cardan’s Formulas. We conclude this section with a surprising application 
of Lagrange resolvents. Let $F = \mathbb{Q}(\omega)$, where $\omega = e^{2\pi i/3}$ is our usual cube root of 
unity. Note that $\omega$ is primitive as defined at the beginning of the section. 

We will study the universal cubic 

$f = x^3 - \sigma_1 x^2 + \sigma_2 x - \sigma_3 = (x - x_1)(x - x_2)(x - x_3)$. 

If we regard this as a polynomial with coefficients in $K = F(\sigma_1,\sigma_2,\sigma_3)$, then the 
splitting field of $f$ is the universal extension in degree 3, 

$K = F(\sigma_1,\sigma_2,\sigma_3) \subset L = F(x_1,x_2,x_3)$, 

with Galois group $\mathrm{Gal}(L/K) = S_3$ (we identify an automorphism with the permutation 
it induces on the roots). As noted in Example 8.1.2, the subgroups 

$\{e\} \subset A_3 \subset S_3$ 

show that $S_3$ is solvable. Hence $K \subset L$ is solvable. 

A more interesting picture emerges when we apply the proof of Theorem 8.3.3 
to this situation. Since (8.12) is satisfied, the Special Case tells us to take the fixed 
fields of the above subgroups. By Theorem 7.4.4, these fixed fields are 

$K \subset K(\sqrt{\Delta}) \subset L$, 

where $\Delta \in K$ is the discriminant of $f$ and $\sqrt{\Delta} = (x_2 - x_1)(x_3 - x_2)(x_3 - x_1)$. (This 
differs by a sign from the formula for $\sqrt{\Delta}$ used in Theorem 7.4.4. However, it gives 
the same field $K(\sqrt{\Delta})$ and leads to nicer formulas below.) 

Since $K \subset K(\sqrt{\Delta})$ is clearly radical, we turn our attention to $K(\sqrt{\Delta}) \subset L$. Here, 
the Galois group is $A_3 \simeq \mathbb{Z}/3\mathbb{Z}$ (be sure you know why). Since $K = \mathbb{Q}(\omega,\sigma_1,\sigma_2,\sigma_3)$ 
contains the primitive cube root of unity $\omega$, Lemma 8.3.2 implies that there is $\alpha \in L$ 
such that 

$L = K(\sqrt{\Delta})(\alpha)$, $\alpha^3 \in K(\sqrt{\Delta})$. 

To get an explicit formula for $\alpha$, we use the Lagrange resolvent $\alpha_1$ defined in (8.7) 
for the generator $\sigma = (1\,2\,3)$ of $A_3$. Setting $\zeta = \omega$, $\beta = x_1$, and $i = 1$ gives 

$\alpha_1 = x_1 + \omega^{-1}\sigma \cdot x_1 + \omega^{-2}\sigma^2 \cdot x_1 = x_1 + \omega^2 x_2 + \omega x_3$, 

since $\omega^{-1} = \omega^2$. This formula for $\alpha_1$ relates nicely to Section 1.2: 

• In (1.10) of Section 1.2 we showed that 

$z_1 = \tfrac{1}{3}(x_1 + \omega^2 x_2 + \omega x_3)$ 

is a root of the cubic resolvent (1.9). Up to the factor of $\tfrac{1}{3}$ this is precisely $\alpha_1$. So 
Galois theory explains where $z_1$ in (1.10) comes from! 

• Furthermore, recall that the roots of the cubic resolvent listed in (1.10) were 
obtained from $z_1$ by applying elements of $S_3$. Thus (1.10) comes from $z_1$ by 
applying elements of $\mathrm{Gal}(L/K)$. It follows from (7.1) that the cubic resolvent is 
the minimal polynomial of $z_1$. Galois theory explains the cubic resolvent! 

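This is nice, but things get even better when we use our methods for computing 
symmetric polynomials. Namely, $\alpha_1^3 \in K(\sqrt{\Delta})$ implies that $\alpha_1^3 = A + B\sqrt{\Delta}$ for some 
$A, B \in F(\sigma_1,\sigma_2,\sigma_3)$. Since $\alpha_1$ is a polynomial in the $x_i$, Exercise 3 of Section 7.4 
implies that $A, B \in F[\sigma_1,\sigma_2,\sigma_3]$. In Exercise 8 you will show that 

$\alpha_1^3 = -\tfrac{27}{2}q + \tfrac{3\sqrt{-3}}{2}\sqrt{\Delta} = \tfrac{27}{2}\Big(-q + \sqrt{-\tfrac{\Delta}{27}}\Big)$, 

where $q = -2\sigma_1^3/27 + \sigma_1\sigma_2/3 - \sigma_3$. This allows us to write 

$\alpha_1 = x_1 + \omega^2 x_2 + \omega x_3 = 3\sqrt[3]{\tfrac{1}{2}\Big(-q + \sqrt{-\tfrac{\Delta}{27}}\Big)}$. 

In Exercise 8 you will also show that if we set $\beta_1 = (2\,3) \cdot \alpha_1$, then 

$\beta_1 = x_1 + \omega^2 x_3 + \omega x_2 = 3\sqrt[3]{\tfrac{1}{2}\Big(-q - \sqrt{-\tfrac{\Delta}{27}}\Big)}$ 

and 

(8.15) 
$x_1 = \tfrac{1}{3}(\sigma_1 + \alpha_1 + \beta_1)$, 
$x_2 = \tfrac{1}{3}(\sigma_1 + \omega\alpha_1 + \omega^2\beta_1)$, 
$x_3 = \tfrac{1}{3}(\sigma_1 + \omega^2\alpha_1 + \omega\beta_1)$. 

If you compare this with (1.8), (1.18), and (1.19) in Section 1.2, you will see we have 
derived Cardan’s formulas using Galois theory. 
We will say more about solving polynomials by radicals in Section 8.5. 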
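The identities of this subsection can be spot-checked numerically. The sketch below is an illustration only: it starts from three chosen numbers $x_1, x_2, x_3$ (rather than solving a cubic), forms $\alpha_1$ and $\beta_1$, and measures how far the formula for $\alpha_1^3$, its analogue for $\beta_1^3$, and (8.15) are from holding. We assume the sign convention $\sqrt{\Delta} = (x_2 - x_1)(x_3 - x_2)(x_3 - x_1)$ used above and realize $\sqrt{-\Delta/27}$ as $(\sqrt{-3}/9)\sqrt{\Delta}$ with $\sqrt{-3} = \omega - \omega^2$; the function name is hypothetical.

```python
import cmath

OMEGA = cmath.exp(2j * cmath.pi / 3)   # omega = e^(2*pi*i/3)
SQRT_MINUS_3 = OMEGA - OMEGA**2        # equals i*sqrt(3), an element of F = Q(omega)


def cardan_errors(x1, x2, x3):
    """Return (error in the cube formulas, error in (8.15)) for given roots."""
    s1 = x1 + x2 + x3
    s2 = x1*x2 + x1*x3 + x2*x3
    s3 = x1*x2*x3
    q = -2*s1**3/27 + s1*s2/3 - s3
    sqrt_disc = (x2 - x1)*(x3 - x2)*(x3 - x1)   # sign convention of this section
    radical = SQRT_MINUS_3/9 * sqrt_disc        # a square root of -Delta/27
    alpha1 = x1 + OMEGA**2*x2 + OMEGA*x3
    beta1 = x1 + OMEGA**2*x3 + OMEGA*x2         # (2 3) applied to alpha1
    cube_err = max(abs(alpha1**3 - 27/2*(-q + radical)),
                   abs(beta1**3 - 27/2*(-q - radical)))
    recovered = [(s1 + alpha1 + beta1)/3,
                 (s1 + OMEGA*alpha1 + OMEGA**2*beta1)/3,
                 (s1 + OMEGA**2*alpha1 + OMEGA*beta1)/3]
    root_err = max(abs(r - x) for r, x in zip(recovered, (x1, x2, x3)))
    return cube_err, root_err


print(cardan_errors(1, 2, 4))            # both entries are at rounding-error level
print(cardan_errors(0.5, -3, 2 + 1j))    # works for complex roots as well
```

Working backward from the roots avoids choosing branches of cube roots, which is the delicate point when Cardan’s formulas are used to solve a cubic directly.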


Historical Notes 


Solvable groups first appeared in Galois’s version of Theorem 8.3.3. Here is an 
extract from his statement of the theorem [Galois, pp. 57-59]: 


I first observe that to solve an equation, it is necessary to reduce its group 
until it contains only a single permutation .. . 


Given this, we will try to find the condition satisfied by the group of an 
equation for which it is possible to reduce the group [to a single permutation] by 
adjunction of radical quantities ... 
The second part of the quotation refers to a radical extension. Furthermore, since the 
radicals need not lie in the splitting field of the polynomial, Galois is describing a 
solvable extension. 

The first part of the quotation explains the strategy used by Galois: As more 
radicals are adjoined, the field gets bigger, so that under the Galois correspondence 
the group gets smaller. Furthermore, if the splitting field is K C L, then the fixed 
field of {e} C Gal(L/K) is L. This means that when the group is reduced to “a single 
permutation,” we have found the splitting field. 

Galois wants to know the “condition satisfied by the group” in this situation. His 
method is to study how the Galois group changes under the adjunction of a pth root 
for some prime p. In [Galois, p. 59], he says the following: 


We can always suppose ... that included among the quantities adjoined 
earlier to the equation is a pth root of unity $\alpha$ ... 


Consequently, by theorems II and III, the group of the equation should 
decompose into p groups that have in relation to one another the following double 
property: 1° that one passes from one to the other by a single permutation; 2° 
that they all contain the same substitutions. 


The first part of this quotation refers to adjoining roots of unity, just as we did in the 
proof of Theorem 8.3.3. The second part seems more obscure until one realizes that 
the “double property” 1° and 2° is Galois’s awkward way of saying normal subgroup. 
Then decomposing into “p groups” refers to cosets of the subgroup, so that we have 
a normal subgroup of prime index p. Since this happens for the radical adjunctions 
that reduce the Galois group to the identity, we see that the Galois group is solvable. 
This is the condition that Galois sought and is the first appearance of solvable groups 
in mathematics. 

Galois also asserts that the converse is true. The main point is Lemma 8.3.2, which 
Galois states as follows [Galois, pp. 59-61]: 


I say reciprocally that if the group of the equation can be partitioned into p 
groups that have this double property, one can, by a simple extraction of a pth 
root, and by adjunction of this pth root, reduce the group of the equation to one 
of the partial groups. 


The key step in his proof is a Lagrange resolvent, which Galois writes as 

$\theta + \alpha\theta_1 + \alpha^2\theta_2 + \cdots + \alpha^{p-1}\theta_{p-1}$, 

where $\theta, \theta_1, \dots, \theta_{p-1}$ correspond to $\beta, \sigma(\beta), \dots, \sigma^{p-1}(\beta)$ in (8.7) and $\alpha$ is a $p$th root 
of unity. Students usually find the proof of Theorem 8.3.3 to be straightforward, 
with the exception of the Lagrange resolvent (8.7), which seems to come out of the 
blue. Yet here is Galois using essentially the same resolvent with no explanation 
whatsoever. As we will learn in Chapter 12, Galois didn’t need to say anything, for 
Lagrange had worked out the theory of such resolvents in detail in 1770. 

One observation is that when Galois says “partitioned into p groups,” he seems to 
be using the term “group” for both a subgroup and its cosets. In fact, the situation is 
even more complicated, as we will see in Chapter 12. Given that we are at the birth 
of group theory, some confusion about terminology is understandable. 
Exercises for Section 8.3 


Exercise 1. Let $m$ be a positive integer, and let $L$ be a field of characteristic 0. Then let $L \subset M$ 
be the splitting field of $x^m - 1 \in L[x]$. 

(a) Prove that $x^m - 1$ is separable. 

(b) Prove that the roots of $x^m - 1$ lying in $M$ form a group under multiplication. 


Exercise 2. Assume that $F \subset L$ is a Galois extension and that $F$ has characteristic 0. Also, 
consider the extension $L \subset L(\zeta)$ obtained by adjoining a primitive $m$th root of unity. Prove 
that $F \subset L(\zeta)$ is Galois. 


Exercise 3. Prove (8.9), where $\zeta$ is a primitive $p$th root of unity and $1 \le i \le p-1$. 


Exercise 4. Consider the extension $F_{i-1} \subset F_i$ of (8.11). In the discussion following (8.11), we 
showed that this extension is Galois. We now describe its Galois group. 
(a) Let $\sigma \in \mathrm{Gal}(F_i/F_{i-1})$. Show that there is a unique integer $0 \le \ell \le m_i - 1$ such that 
$\sigma(\gamma_i) = \zeta_i^{\ell}\gamma_i$. 
(b) Show that $\sigma \mapsto [\ell]$ defines a one-to-one homomorphism $\mathrm{Gal}(F_i/F_{i-1}) \to \mathbb{Z}/m_i\mathbb{Z}$, where 
$[\ell]$ is the congruence class of $\ell$ modulo $m_i$. 
(c) Conclude that $\mathrm{Gal}(F_i/F_{i-1})$ is cyclic. 


Exercise 5. Suppose that we have extensions $F \subset F_{i-1} \subset F_i \subset L$ such that $L$ is Galois over $F$ 
and $F_i$ is Galois over $F_{i-1}$. Prove that $|\mathrm{Gal}(F_i/F_{i-1})|$ divides $|\mathrm{Gal}(L/F)|$. 


Exercise 6. Let $L$ be a field containing a primitive $m$th root of unity $\zeta$ and let $n$ be a positive 
divisor of $m$. Prove that $\zeta^{m/n}$ is a primitive $n$th root of unity. 


Exercise 7. Let $F \subset L$ be Galois and solvable (with $F$ of characteristic 0). This exercise will 
consider a variation of Corollary 8.3.4. Let $p_1,\dots,p_r$ be the distinct primes dividing $[L : F]$. 
(a) Show that $F$ contains a primitive $(p_1 \cdots p_r)$th root of unity if and only if $F$ contains a 
primitive $p_i$th root of unity for $i = 1,\dots,r$. 
(b) Prove that $F \subset L$ is radical when $F$ contains a primitive $(p_1 \cdots p_r)$th root of unity. 
(c) Prove that $F \subset L(\zeta)$ is radical, where $\zeta$ is a primitive $(p_1 \cdots p_r)$th root of unity. 


Exercise 8. This exercise concerns the details of our derivation of Cardan’s formulas. 
(a) Use the computational methods of Section 2.3 to obtain the formulas for $\alpha_1^3$ and $\beta_1$ stated 
in the text. 
(b) Prove (8.15). 


8.4 SIMPLE GROUPS 


Here is the key definition of this section. 
Definition 8.4.1 A group G is simple if its only normal subgroups are {e} and G. 
Some simple groups are easy to find. 


Example 8.4.2 If $p$ is prime, then Lagrange’s Theorem implies that the cyclic group 
$\mathbb{Z}/p\mathbb{Z}$ is simple. In Exercise 1 you will prove that these are the only nontrivial 
Abelian finite simple groups. ◁▷ 
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Here are some more interesting simple groups. 
Theorem 8.4.3 The alternating group A, is simple for all n > 5. 


Proof: Our argument will use the following two properties of $A_n$:
• An $l$-cycle $(i_1\, i_2 \cdots i_l)$ lies in $A_n$ if and only if $l$ is odd.

• If $n \ge 3$, then $A_n$ is generated by 3-cycles.

The first property follows from the identity (A.2)

$(i_1\, i_2 \cdots i_l) = (i_1\, i_l) \cdots (i_1\, i_3)(i_1\, i_2)$

of Section A.1, which shows that an $l$-cycle is a product of $l - 1$ transpositions. The
second property is less obvious and will be proved in Exercise 2.

Suppose that $H \ne \{e\}$ is a normal subgroup of $A_n$. It suffices to show that
$H = A_n$. To prove this, we first show that $H$ contains a 3-cycle. By assumption $H$
contains a nontrivial permutation $\sigma$. We will create a 3-cycle in $H$ by considering the
decomposition of $\sigma$ into disjoint cycles.

Since $A_n$ contains the 3-cycle $(j_1\, j_2\, j_3)$ and $H \subset A_n$ is normal, it follows that

(8.16) $\sigma^{-1}(j_1\, j_2\, j_3)^{-1}\sigma(j_1\, j_2\, j_3) \in H.$

This will be useful because it will allow us to create some interesting elements of $H$.
In Exercise 3 you will prove that the permutation (8.16) has the following property:

(8.17) If neither $j$ nor $\sigma(j)$ lies in $\{j_1, j_2, j_3\}$, then $\sigma^{-1}(j_1\, j_2\, j_3)^{-1}\sigma(j_1\, j_2\, j_3)$ fixes $j$.

This is important because the given permutation $\sigma \in H$ might be very complicated,
especially if $n$ is large. But (8.17) shows that $\sigma^{-1}(j_1\, j_2\, j_3)^{-1}\sigma(j_1\, j_2\, j_3)$ is a simpler
permutation, since it moves at most six elements of $\{1,\dots,n\}$. Furthermore, this
simpler permutation lies in $H$ by (8.16). We will exploit this by making careful
choices of the 3-cycle $(j_1\, j_2\, j_3)$.

We now prove that $H$ contains a 3-cycle by considering the following cases.

Case 1. First suppose that one of the cycles in $\sigma$ has length $\ge 4$, say

$\sigma = (i_1\, i_2\, i_3\, i_4 \cdots)(\cdots)\cdots.$

In this case, we claim that

(8.18) $\sigma^{-1}(i_2\, i_3\, i_4)^{-1}\sigma(i_2\, i_3\, i_4) = (i_1\, i_3\, i_4).$

By (8.17), this permutation fixes all $j \notin \{i_1, i_2, i_3, i_4\}$, and from here it is easy to verify
(8.18). We leave the details as part of Exercise 3. Since (8.18) and (8.16) imply that
$(i_1\, i_3\, i_4) \in H$, we have the desired 3-cycle.


Case 2. Next suppose that $\sigma$ has a 3-cycle. If $\sigma$ is a 3-cycle, then we are done. Hence
we may assume that

$\sigma = (i_1\, i_2\, i_3)(i_4\, i_5 \cdots)\cdots.$

We claim that

(8.19) $\sigma^{-1}(i_2\, i_3\, i_5)^{-1}\sigma(i_2\, i_3\, i_5) = (i_1\, i_4\, i_2\, i_3\, i_5).$

This 5-cycle lies in $H$ by (8.16), so $H$ contains a 3-cycle by Case 1.


Case 3. Finally suppose that $\sigma$ is a product of disjoint 2-cycles. There must be at
least two since $\sigma \in H \subset A_n$, so that

$\sigma = (i_1\, i_2)(i_3\, i_4)(\cdots)(\cdots)\cdots.$

This time, we have

(8.20) $\sigma^{-1}(i_2\, i_3\, i_4)^{-1}\sigma(i_2\, i_3\, i_4) = (i_1\, i_3)(i_2\, i_4)$

(see Exercise 3). As usual, this shows that $(i_1\, i_3)(i_2\, i_4) \in H$. To turn this into a
3-cycle, let $i_5$ be distinct from $i_1, i_2, i_3, i_4$ (this is where we use $n \ge 5$). Then we
compute directly that

$(i_1\, i_3)(i_2\, i_4)\,(i_1\, i_3\, i_5)^{-1}\bigl((i_1\, i_3)(i_2\, i_4)\bigr)(i_1\, i_3\, i_5) = (i_1\, i_5\, i_3).$

Again we get a 3-cycle in $H$.


Since every $\sigma \ne e$ in $H$ must satisfy one of these three cases, we conclude that $H$
contains some 3-cycle, say $(i\, j\, k)$. We next claim that $H$ contains all 3-cycles, since it
is normal. To prove this, suppose that $i', j', k'$ are distinct, and let $\theta$ be a permutation
that satisfies

$\theta(i) = i', \quad \theta(j) = j', \quad \theta(k) = k'.$

An important property of permutations is that for any cycle $(i_1\, i_2 \cdots i_l)$, we have the
identity

(8.21) $\theta\,(i_1\, i_2 \cdots i_l)\,\theta^{-1} = (\theta(i_1)\, \theta(i_2) \cdots \theta(i_l))$

(see Exercise 3). This implies that

$\theta\,(i\, j\, k)\,\theta^{-1} = (i'\, j'\, k').$

If $\theta \in A_n$, then $(i'\, j'\, k') \in H$, since $H$ is normal in $A_n$. On the other hand, if $\theta \notin A_n$,
then $\theta' = \theta\,(i\, j) \in A_n$. The above computation, performed using $\theta'$ instead of $\theta$, shows
that $(j'\, i'\, k') \in H$ (you should verify this carefully). Then $(i'\, j'\, k') = (j'\, i'\, k')^{-1} \in H$,
as claimed.

Thus $H$ contains all 3-cycles. At the beginning of the proof, we noted that $A_n$ is
generated by 3-cycles. It follows immediately that $H = A_n$, and we are done. ∎
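The identities (8.18) through (8.21) can be spot-checked by machine. The following is a minimal sketch, not part of the proof: it assumes permutations are stored as tuples and multiplied right to left (the convention under which the identities above hold), and the helper names `cycle`, `mul`, `inv`, and `commutator` are illustrative choices. Each check uses one sample permutation, so it confirms an instance rather than the general statement.

```python
def cycle(n, *pts):
    """The cycle (pts[0] pts[1] ...) as a tuple p with p[x] = image of x, on 0..n-1."""
    p = list(range(n))
    for a, b in zip(pts, pts[1:] + pts[:1]):
        p[a] = b
    return tuple(p)

def mul(*perms):
    """Product of permutations, composed right to left: mul(a, b)[x] == a[b[x]]."""
    out = tuple(range(len(perms[0])))
    for p in reversed(perms):          # the rightmost factor acts first
        out = tuple(p[y] for y in out)
    return out

def inv(p):
    """Inverse permutation."""
    q = [0] * len(p)
    for x, y in enumerate(p):
        q[y] = x
    return tuple(q)

def commutator(s, t):
    """The element s^-1 t^-1 s t that appears in (8.16)."""
    return mul(inv(s), inv(t), s, t)

N = 8  # points 1..7 play the roles of i_1..i_7; the point 0 is never moved

# Case 1: sigma contains a cycle of length >= 4.
s1 = cycle(N, 1, 2, 3, 4, 5)
case1 = commutator(s1, cycle(N, 2, 3, 4)) == cycle(N, 1, 3, 4)          # (8.18)

# Case 2: sigma = (i1 i2 i3)(i4 i5 ...)...
s2 = mul(cycle(N, 1, 2, 3), cycle(N, 4, 5), cycle(N, 6, 7))
case2 = commutator(s2, cycle(N, 2, 3, 5)) == cycle(N, 1, 4, 2, 3, 5)    # (8.19)

# Case 3: sigma is a product of disjoint 2-cycles.
s3 = mul(cycle(N, 1, 2), cycle(N, 3, 4))
rho = mul(cycle(N, 1, 3), cycle(N, 2, 4))
case3 = commutator(s3, cycle(N, 2, 3, 4)) == rho                        # (8.20)
three_cycle = commutator(rho, cycle(N, 1, 3, 5)) == cycle(N, 1, 5, 3)

# Conjugation identity (8.21) for one sample theta.
theta = mul(cycle(N, 1, 4, 6), cycle(N, 2, 7))
conj = (mul(theta, cycle(N, 1, 2, 3), inv(theta))
        == cycle(N, theta[1], theta[2], theta[3]))

print(case1, case2, case3, three_cycle, conj)
```

Changing `mul` to compose left to right makes several of these checks fail, which is a quick way to see that the identities depend on the composition convention.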


We next observe that non-Abelian finite simple groups are not solvable.

Lemma 8.4.4 Let $G$ be a non-Abelian finite simple group. Then $G$ is not solvable.

Proof: Suppose that $G$ is solvable. Then we can find a normal subgroup $G_1 \subset G_0 = G$
such that $[G : G_1] = [G_0 : G_1]$ is prime. Since $G$ is simple, we must have $G_1 = \{e\}$,
since $G_1 \ne G$. Thus

$|G| = [G : G_1]\,|G_1| = [G : G_1]\,|\{e\}| = [G : G_1],$

so that $G$ has prime order. But this implies that $G$ is cyclic and hence Abelian. The
lemma follows by contradiction. ∎


Combining Lemma 8.4.4 with earlier results gives us infinitely many nonsolvable 
groups as follows. 


Theorem 8.4.5 The alternating group $A_n$ and the symmetric group $S_n$ are solvable
if and only if $n \le 4$.

Proof: The cases $n = 1, 2$ are trivial, and we saw in Example 8.1.2 and Exercise 1
of Section 8.1 that $S_3$ and $S_4$ are solvable. By Proposition 8.1.3 it follows that $A_3$ and
$A_4$ are solvable (this is also easy to prove directly).

Now suppose that $n \ge 5$. Then $A_n$ is non-Abelian (the 3-cycles $(1\,2\,3)$ and $(1\,2\,4)$
don't commute) and simple (by Theorem 8.4.3). By Lemma 8.4.4 we conclude that
$A_n$ is not solvable for $n \ge 5$. Then Proposition 8.1.3 shows that $S_n$ is also not solvable
for $n \ge 5$. ∎
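For small $n$, Theorem 8.4.5 can be confirmed by brute force. The sketch below does not use the prime-index chains of Definition 8.1.1. It assumes instead the standard equivalent criterion for finite groups: a group is solvable exactly when repeatedly passing to the commutator subgroup eventually reaches the trivial group. All function names are illustrative.

```python
from itertools import permutations

def compose(a, b):
    """(a after b): x -> a[b[x]]."""
    return tuple(a[x] for x in b)

def inverse(a):
    q = [0] * len(a)
    for x, y in enumerate(a):
        q[y] = x
    return tuple(q)

def is_even(p):
    """A permutation is even when it has an even number of inversions."""
    n = len(p)
    return sum(1 for i in range(n) for j in range(i + 1, n) if p[i] > p[j]) % 2 == 0

def generated_subgroup(gens, n):
    """Closure of gens under multiplication (enough in a finite group)."""
    identity = tuple(range(n))
    seen, frontier = {identity}, [identity]
    while frontier:
        g = frontier.pop()
        for s in gens:
            h = compose(g, s)
            if h not in seen:
                seen.add(h)
                frontier.append(h)
    return seen

def derived_subgroup(G, n):
    """Subgroup generated by all commutators a^-1 b^-1 a b with a, b in G."""
    comms = {compose(compose(inverse(a), inverse(b)), compose(a, b))
             for a in G for b in G}
    return generated_subgroup(comms, n)

def is_solvable(G, n):
    """Assumed criterion: the derived series of G reaches the trivial group."""
    while len(G) > 1:
        D = derived_subgroup(G, n)
        if len(D) == len(G):      # the series has stabilized above {e}
            return False
        G = D
    return True

def symmetric(n):
    return set(permutations(range(n)))

def alternating(n):
    return {p for p in symmetric(n) if is_even(p)}

S_results = [is_solvable(symmetric(n), n) for n in range(1, 6)]
A_results = [is_solvable(alternating(n), n) for n in range(1, 6)]
print(S_results, A_results)
```

The run stops at $n = 5$ because the commutator computation is quadratic in the group order; the theorem itself covers all $n$.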


For later purposes, we determine the normal subgroups of $S_n$.

Proposition 8.4.6 If $n \ge 5$ and $H \subset S_n$ is a normal subgroup, then either $H = \{e\}$,
$H = A_n$, or $H = S_n$.

Proof: If $H$ is normal in $S_n$, then $H \cap A_n$ is normal in $A_n$ (see Exercise 4). Since
$n \ge 5$, Theorem 8.4.3 implies that $H \cap A_n$ is $\{e\}$ or $A_n$. In the latter case, we have
$A_n \subset H$, which easily implies that $H = A_n$ or $S_n$, since $[S_n : A_n] = 2$.

Finally suppose that $H \cap A_n = \{e\}$. If $H \ne \{e\}$, then Exercise 5 will show that
$H = \{e, \sigma\}$, where

$\sigma = (i\, j)(\cdots)\cdots$

is the product of an odd number of disjoint 2-cycles. Now pick $k$ different from $i$ and
$j$, and let $\theta = (j\, k)$. Then (8.21) implies that

$\theta\sigma\theta^{-1} = \theta(i\, j)\theta^{-1}\,\theta(\cdots)\theta^{-1}\,\theta\cdots\theta^{-1} = (i\, k)(\cdots)\cdots.$

This is still a product of disjoint 2-cycles. Since one of the cycles is $(i\, k)$, it can't
equal $\sigma = (i\, j)(\cdots)\cdots$. Thus $\theta\sigma\theta^{-1} \notin H$, which is impossible, since $H$ is normal.
This contradiction shows that $H = \{e\}$ and completes the proof. ∎


Mathematical Notes 


The relation between simple and solvable groups is more interesting than indicated
by Lemma 8.4.4. The key observation is that all groups are "built" out of simple
groups by means of what are called composition series.

Composition Series and the Jordan-Hölder Theorem. Definition 8.1.1 says that
a group $G$ is solvable if we can find subgroups

(8.22) $\{e\} = G_n \subset G_{n-1} \subset \cdots \subset G_1 \subset G_0 = G$

such that $G_i$ is normal in $G_{i-1}$ and $[G_{i-1} : G_i]$ is prime for $i = 1,\dots,n$. This implies
in particular that the quotient $G_{i-1}/G_i$ is simple, since it has prime order.

More generally, if $G$ is a finite group, then a composition series of $G$ consists of
subgroups (8.22) such that $G_i$ is normal in $G_{i-1}$ and the quotient $G_{i-1}/G_i$ is simple
for all $i$. We call the $G_{i-1}/G_i$ the composition factors of $G$.


Example 8.4.7 Let $n \ge 5$. Since $A_n$ is simple, a composition series of $S_n$ is

$\{e\} \subset A_n \subset S_n.$

The composition factors are $A_n/\{e\} \simeq A_n$ and $S_n/A_n \simeq \mathbb{Z}/2\mathbb{Z}$. ◁▷


It is straightforward to show that any finite group has a composition series (see
Exercise 6). However, a given group may have more than one composition series.
For example, the cyclic group $\mathbb{Z}/6\mathbb{Z} = \langle[1]\rangle$ has the composition series

$\{e\} \subset \langle[2]\rangle \subset \mathbb{Z}/6\mathbb{Z}$ and $\{e\} \subset \langle[3]\rangle \subset \mathbb{Z}/6\mathbb{Z}.$

The factors of the first composition series are $\mathbb{Z}/2\mathbb{Z}$ and $\mathbb{Z}/3\mathbb{Z}$, while the factors for
the second are $\mathbb{Z}/3\mathbb{Z}$ and $\mathbb{Z}/2\mathbb{Z}$. The Jordan-Hölder Theorem asserts that any two
composition series of a given group have the same length and that the corresponding
composition factors can be permuted so that they become isomorphic. Hence the
composition factors of a group are the simple groups from which the group is built.
Here "built" refers to the extension problem discussed in the Mathematical Notes to
Section 6.4. For more on composition series, see [Jacobson, Vol. I, Sec. 4.6].
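The two composition series of $\mathbb{Z}/6\mathbb{Z}$ can be generated mechanically. The sketch below is only an illustration for cyclic groups: it assumes that a chosen ordering of the prime factors of $n$ determines the chain $\langle[1]\rangle \supset \langle[p_1]\rangle \supset \langle[p_1p_2]\rangle \supset \cdots$, and it reports the orders of the successive quotients. The function names are hypothetical.

```python
def subgroup(n, d):
    """The cyclic subgroup <[d]> of Z/nZ, as a sorted list of residues."""
    return sorted({(d * k) % n for k in range(n)})

def series_from_prime_order(n, primes):
    """Chain Z/nZ = <[1]>, <[p1]>, <[p1*p2]>, ..., ending at {0}.

    `primes` must list the prime factors of n with multiplicity, in any order.
    """
    d, chain = 1, [subgroup(n, 1)]
    for p in primes:
        d *= p
        chain.append(subgroup(n, d))
    return chain

def factor_orders(chain):
    """Orders of the quotients G_{i-1}/G_i, read from the top of the chain down."""
    return [len(a) // len(b) for a, b in zip(chain, chain[1:])]

first = series_from_prime_order(6, [2, 3])    # passes through <[2]>
second = series_from_prime_order(6, [3, 2])   # passes through <[3]>
print(first, factor_orders(first))
print(second, factor_orders(second))
```

Both runs yield the same multiset of factor orders, $\{2, 3\}$, in opposite order, which is the Jordan-Hölder statement in this small case.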

In particular, a finite group is solvable if and only if its composition factors are the 
“simplest” simple groups, namely the Abelian ones. This shows that solvable groups 
form a very special class of groups. 


Historical Notes 


The term "simple group" is due to Jordan. He was the first to prove that $A_n$ is
simple for $n \ge 5$. However, concerning $A_5$, Galois noted in 1832 that "the smallest
number of permutations for which there is an indecomposable group is $5 \cdot 4 \cdot 3$ when
the number is not prime" [Galois, p. 175]. The simplicity of $A_5$ is also implicit in the
work of Ruffini and Abel on the unsolvability of the quintic equation.

The idea of a composition series is due to Jordan. He proved that any two
composition series have the same length and that the indices $[G_{i-1} : G_i]$ are unique up
to a permutation. Later, once the concept of quotient group was better understood,
Hölder proved the Jordan-Hölder Theorem mentioned above.

Exercises for Section 8.4


Exercise 1. Let $G$ be a nontrivial finite Abelian group. Prove that $G$ is simple if and only if
$G \simeq \mathbb{Z}/p\mathbb{Z}$ for some prime $p$.


Exercise 2. Prove that $A_n$ is generated by 3-cycles when $n \ge 3$.


Exercise 3. This exercise is concerned with the proof of Theorem 8.4.3.
(a) Prove (8.17).

(b) Verify the identities (8.18), (8.19), and (8.20).

(c) Verify the conjugation identity (8.21).


Exercise 4. Let $H_1$ and $H_2$ be subgroups of a group $G$ and assume that $H_1$ is normal in $G$.
Prove that $H_1 \cap H_2$ is normal in $H_2$.


Exercise 5. Suppose that $H \subset S_n$ is a subgroup such that $H \ne \{e\}$ and $H \cap A_n = \{e\}$. Prove
that $H = \{e, \sigma\}$, where $\sigma$ is a product of an odd number of disjoint 2-cycles.


Exercise 6. Let G be a finite group. 


(a) Among all normal subgroups of G different from G itself, pick one of maximal order and 
call it H. Prove that G/H is a simple group. 


(b) Use part (a) and complete induction on |G| to prove that G has a composition series. 


Exercise 7. Show that the Feit-Thompson Theorem (Theorem 8.1.9) is equivalent to the 
assertion that every non-Abelian finite simple group has even order. 


Exercise 8. Prove that Z/4Z and Z/2Z x Z/2Z are nonisomorphic groups with the same 
composition factors. 


8.5 SOLVING POLYNOMIALS BY RADICALS 


As in Section 8.3 we will assume the following: 
All fields appearing in this section will have characteristic 0. 
The goal of this section is to study the roots of polynomials using Galois theory. 


A. Roots and Radicals. So far, our discussion of solvability by radicals has 
focused on field extensions. We now shift our attention to polynomials and their 
roots. 


Definition 8.5.1 Let $f \in F[x]$ be nonconstant with splitting field $F \subset L$.
(a) A root $\alpha \in L$ of $f$ is expressible by radicals over $F$ if $\alpha$ lies in some radical
extension of $F$.


(b) The polynomial $f$ is solvable by radicals over $F$ if $F \subset L$ is a solvable extension.


In Exercise 1 you will show that part (b) of this definition doesn't depend on which
splitting field of $f$ over $F$ we use.

Definition 8.5.1 implies that if a nonconstant polynomial in $F[x]$ is solvable by
radicals, then all of its roots are expressible by radicals. However, for an irreducible
polynomial, it turns out that solvability by radicals is satisfied as soon as one root is
expressible by radicals. Here is the precise result.


Proposition 8.5.2 Let $f \in F[x]$ be irreducible. Then $f$ is solvable by radicals over
$F$ if and only if $f$ has a root expressible by radicals over $F$.


Proof: One direction is obvious. Going the other way, suppose that $f$ has a root
$\alpha$ in some radical extension of $F$. This means that $F \subset F(\alpha)$ is solvable, so that by
Corollary 8.2.9, its Galois closure $F \subset F(\alpha) \subset M$ is also solvable. (Remember that
we are in characteristic 0.)

Since a Galois extension is normal and $f$ is irreducible over $F$ with a root in $M$,
we see that $f$ splits completely over $M$. Thus $M$ contains the splitting field of $f$ over
$F$ (in fact, $M$ is the splitting field; see Exercise 2). The proposition follows, since
$F \subset M$ is solvable. ∎


We can now apply the theory developed in Sections 8.3 and 8.4. Recall from
Definition 6.1.12 that the Galois group of $f \in F[x]$ is $\mathrm{Gal}(L/F)$, where $L$ is a splitting
field of $f$ over $F$. Then Theorem 8.3.3 implies the following.


Theorem 8.5.3 A polynomial $f \in F[x]$ is solvable by radicals over $F$ if and only if
the Galois group of $f$ over $F$ is solvable. ∎


We can apply this to polynomials of low degree as follows.

Proposition 8.5.4 If $f \in F[x]$ has degree $n \le 4$, then $f$ is solvable by radicals.


Proof: If $f$ is separable, then the Galois group of $f$ is isomorphic to a subgroup of
$S_n$ by Proposition 6.3.1, and we are done by Theorem 8.5.3, since $S_n$ is solvable for
$n \le 4$. See Exercise 3 for the case when $f$ is not separable. ∎


Once we get to degree 5, a different picture emerges. 


Example 8.5.5 In Section 6.4, we showed that $f = x^5 - 6x + 3$ has $S_5$ as Galois group
over $\mathbb{Q}$. But $S_5$ is not solvable by Theorem 8.4.5, so that $f$ is not solvable by radicals
over $\mathbb{Q}$ by Theorem 8.5.3. Furthermore, $f$ is irreducible, so that by Proposition 8.5.2,
no root of $f$ is expressible by radicals over $\mathbb{Q}$. ◁▷
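The two numerical facts usually behind the claim that $x^5 - 6x + 3$ has Galois group $S_5$ are easy to check by machine. The sketch below assumes the standard argument (an irreducible quintic over $\mathbb{Q}$ with exactly three real roots has Galois group $S_5$), which may or may not be the route taken in Section 6.4. It tests the Eisenstein criterion at the prime 3 and counts sign changes of $f$ at a few integers; the helper names are illustrative.

```python
def eisenstein(coeffs, p):
    """Eisenstein criterion at the prime p; coeffs = [a_n, ..., a_1, a_0]."""
    lead, *rest = coeffs
    return (lead % p != 0
            and all(c % p == 0 for c in rest)
            and rest[-1] % (p * p) != 0)

def f(x):
    return x**5 - 6*x + 3

def sign_changes(values):
    """Number of sign changes in a sequence of nonzero values."""
    signs = [v > 0 for v in values if v != 0]
    return sum(1 for a, b in zip(signs, signs[1:]) if a != b)

irreducible = eisenstein([1, 0, 0, 0, -6, 3], 3)

# Each sign change between consecutive integers forces a real root in between,
# so this is a lower bound on the number of real roots.
samples = [f(x) for x in range(-2, 3)]
lower_bound = sign_changes(samples)

# Upper bound (by hand): f'(x) = 5x^4 - 6 has exactly two real zeros, so by
# Rolle's theorem f has at most three real roots.
print(irreducible, samples, lower_bound)
```

Together with the Rolle bound noted in the comment, the three sign changes show that $f$ has exactly three real roots and hence one pair of complex conjugate roots.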


This example requires that we revise how we think about the roots of a polynomial. 
Most students come into a course on Galois theory thinking that the roots of a 
polynomial f € Q[x| are numbers like 


V24+V3, V24+V2, VW1247i, etc. 


The English word "root" comes from the Latin "radix," and the radical symbol $\sqrt{\phantom{x}}$ is
a modified version of the first letter "r" of "radix." Historically, "root" came to refer
to a solution of $f(x) = 0$ because of the intuition that roots are built from radicals.
But the above example shows that this intuition is wrong. Roots of polynomials are
intrinsically more complicated than just radicals.

B. The Universal Polynomial. The quadratic formula shows that the universal
quadratic

$f = x^2 - \sigma_1 x + \sigma_2 = (x - x_1)(x - x_2)$

is solvable by radicals, and Cardan's formulas imply that the universal cubic

$f = x^3 - \sigma_1 x^2 + \sigma_2 x - \sigma_3 = (x - x_1)(x - x_2)(x - x_3)$

is solvable by radicals. Furthermore, once we know these formulas in the universal
case, then they apply to all polynomials of degree 2 and 3.

This discussion shows that asking if the quadratic formula and Cardan's formulas
generalize to polynomials of degree $n$ is equivalent to asking if the universal
polynomial of degree $n$,

(8.23) $f = x^n - \sigma_1 x^{n-1} + \cdots + (-1)^n \sigma_n = (x - x_1)\cdots(x - x_n),$

is solvable by radicals. By Section 6.4, the splitting field of $f$ is the universal
extension in degree $n$

$K = F(\sigma_1,\dots,\sigma_n) \subset L = F(x_1,\dots,x_n),$

whose Galois group $\mathrm{Gal}(L/K)$ is isomorphic to $S_n$. Then Theorem 8.5.3 implies
that the existence of radical formulas generalizing the quadratic formula or Cardan's
formulas is equivalent to the solvability of $S_n$.
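The sign pattern in (8.23) can be checked on concrete numbers. The following sketch specializes the variables $x_1,\dots,x_n$ to sample integers, expands the product $(x - x_1)\cdots(x - x_n)$, and compares the coefficients with $(-1)^k\sigma_k$, where $\sigma_k$ is the $k$th elementary symmetric polynomial. The function names and the sample values are illustrative assumptions.

```python
from itertools import combinations
from math import prod

def elementary_symmetric(xs, k):
    """sigma_k(xs): the sum of all products of k distinct entries of xs."""
    return sum(prod(c) for c in combinations(xs, k))

def expand_product(xs):
    """Coefficients of prod (x - x_i), highest degree first."""
    coeffs = [1]
    for r in xs:
        # multiply the current polynomial by (x - r)
        coeffs = [a - r * b for a, b in zip(coeffs + [0], [0] + coeffs)]
    return coeffs

xs = [2, 3, 5]
expanded = expand_product(xs)
predicted = [(-1) ** k * elementary_symmetric(xs, k) for k in range(len(xs) + 1)]
print(expanded, predicted)
```

For the sample roots 2, 3, 5 both lists describe $x^3 - 10x^2 + 31x - 30$, matching the universal cubic with $\sigma_1 = 10$, $\sigma_2 = 31$, $\sigma_3 = 30$.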

In particular, the solvability of S4 implies the existence of radical formulas for 
polynomials of degree 4. These are Ferrari’s formulas, to be discussed in Chapter 12. 
However, when n > 5, we have the following. 


Theorem 8.5.6 If $n \ge 5$, then the universal polynomial $f \in K[x]$ of degree $n$ is not
solvable by radicals over $K$, and no root of $f$ is expressible by radicals over $K$.


Proof: The first assertion follows from Theorem 8.5.3, since $S_n$ is not solvable when
$n \ge 5$ by Theorem 8.4.5, and the second assertion follows from Proposition 8.5.2,
since $f$ is irreducible over $K$. ∎


Thus, while we have the quadratic formula for polynomials of degree 2, Cardan's
formulas for degree 3, and Ferrari's formulas for degree 4, it is impossible to find
radical formulas that apply to all polynomials of degree $n$ when $n \ge 5$.

It is important to keep in mind that for every $n \ge 5$, there are always some
polynomials of degree $n$, such as $x^n - 2 \in \mathbb{Q}[x]$, that are solvable by radicals. It is
only when we try to solve all polynomials by radicals that we run into problems.


C. Abelian Equations. In 1829 Abel considered separable polynomials $f \in F[x]$
that have a root $\alpha$ such that the roots of $f$ are $\theta_1(\alpha),\dots,\theta_n(\alpha)$, where $\theta_1,\dots,\theta_n$ are
rational functions with coefficients in $F$ satisfying

$\theta_i(\theta_j(\alpha)) = \theta_j(\theta_i(\alpha))$

for all $i, j$. Following Kronecker and Jordan, we call $f = 0$ an Abelian equation in
this situation. Abel showed that Abelian equations are solvable by radicals. We can
prove Abel's theorem as follows.


Theorem 8.5.7 Let $f \in F[x]$. If $f = 0$ is an Abelian equation, then $f$ is solvable by
radicals over $F$.


Proof: Theorem 6.5.3 states that the Galois group of an Abelian equation is Abelian.
(As noted in Section 6.5, this result is the origin of the term "Abelian group.") Since
Abelian groups are solvable by Proposition 8.1.5, we are done by Theorem 8.5.3. ∎


This shows that Abel’s theorem on the solvability of Abelian equations follows 
from Galois theory and basic facts about solvable groups. 

We studied Abelian equations in the optional Section 6.5 of Chapter 6. For 
those who read that section, note that Theorem 6.5.2 is simply a restatement of 
Theorem 8.5.7 and that Theorem 6.5.4 follows from Theorem 8.5.3 because Abelian 
groups are solvable. 

For irreducible polynomials, the relation between Abelian equations and Abelian 
groups is especially nice. 


Theorem 8.5.8 Let $f \in F[x]$ be irreducible and separable of degree $n$ with splitting
field $F \subset L$. Then

$f = 0$ is an Abelian equation $\iff$ $\mathrm{Gal}(L/F)$ is an Abelian group.

Furthermore, when these conditions are satisfied, we have

$|\mathrm{Gal}(L/F)| = [L : F] = n$

and $L = F(\alpha)$ for any root $\alpha \in L$ of $f$.


Proof: The implication $\Rightarrow$ was proved in Theorem 6.5.3. For the opposite
implication, let $\alpha \in L$ be a root of $f$. Then $F \subset F(\alpha) \subset L$ gives the subgroup
$\mathrm{Gal}(L/F(\alpha)) \subset \mathrm{Gal}(L/F)$, which is normal since $\mathrm{Gal}(L/F)$ is Abelian. Thus
$F \subset F(\alpha)$ is Galois and hence normal by Theorem 7.3.2. Since $f$ is irreducible
over $F$ and has a root in $F(\alpha)$, it must split completely over $F(\alpha)$ by normality. It
follows that $L = F(\alpha)$. This implies the final assertions of the theorem, and $f = 0$ is
Abelian by Exercise 6 of Section 6.5. ∎


D. The Fundamental Theorem of Algebra Revisited. In Chapter 3 we proved 
the Fundamental Theorem of Algebra using the following two facts: 

• Every polynomial of odd degree in $\mathbb{R}[x]$ has a root in $\mathbb{R}$ (Proposition 3.2.2).

• Every quadratic polynomial in $\mathbb{C}[x]$ splits completely over $\mathbb{C}$ (Lemma 3.2.3).

The proof given in Section 3.2 used induction on the power of 2 in the degree of
$f \in \mathbb{R}[x]$. Artin gave an elegant version of this argument using the Galois correspondence,
the solvability of groups of prime power order (Theorem 8.1.7), and the First Sylow
Theorem (see the Mathematical Notes to Section 8.1). Here is Artin's proof.

Theorem 8.5.9 Every nonconstant polynomial in $\mathbb{C}[x]$ splits completely over $\mathbb{C}$, i.e.,
$\mathbb{C}$ is algebraically closed.


Proof: By Proposition 3.2.1, it suffices to prove that every nonconstant polynomial
in $\mathbb{R}[x]$ splits completely over $\mathbb{C}$. Given such a polynomial $f$, let $\mathbb{R} \subset L$ be its splitting
field. Since $\mathbb{R}$ has characteristic 0, this extension is separable and hence Galois. Let
$G = \mathrm{Gal}(L/\mathbb{R})$, and define $H \subset G$ as follows: If $|G|$ is odd, then $H = \{e\}$, and if $|G|$
is even, then $H$ is a 2-Sylow subgroup of $G$. Hence $H$ is a subgroup of $G$ such that
$|H|$ is the highest power of 2 dividing $|G|$.

By the Galois correspondence, the fixed field $\mathbb{R} \subset L_H$ has degree $[L_H : \mathbb{R}] =
[G : H] = |G|/|H|$. This is odd by the definition of $H$, so that $\mathbb{R} \subset L_H$ has odd degree.
It follows that if $\alpha \in L_H$ is a primitive element over $\mathbb{R}$, then the minimal polynomial
$g \in \mathbb{R}[x]$ of $\alpha$ has odd degree. But by the first bullet above, $g$ has a root in $\mathbb{R}$. Since
minimal polynomials are irreducible, this means that $g$ must have degree 1, which
implies $L_H = \mathbb{R}$.

Then the Galois correspondence implies $H = G$, so that $|G|$ is a power of 2, say
$|G| = 2^n$. If $n = 0$, then $G$ is trivial, which implies that $L = \mathbb{R}$. Hence $f$ splits
completely over $\mathbb{R}$ in this case. Now suppose that $n \ge 1$. By Theorem 8.1.7, $G$ is
solvable, which by $|G| = 2^n$ and Definition 8.1.1 means that we have subgroups

$\{e\} = G_n \subset G_{n-1} \subset \cdots \subset G_1 \subset G_0 = G$

such that $G_i$ is normal in $G_{i-1}$ of index 2 for $1 \le i \le n$. This gives the fixed fields

$\mathbb{R} = L_{G_0} \subset L_{G_1} \subset \cdots \subset L_{G_n} = L$

such that $L_{G_{i-1}} \subset L_{G_i}$ has degree 2 for every $i$.

Since $n \ge 1$, we have the degree 2 extension $\mathbb{R} \subset L_{G_1}$. The minimal polynomial
of a primitive element of this extension is a quadratic polynomial with no real roots.
It follows easily that $L_{G_1} \simeq \mathbb{C}$.

Now suppose that $n \ge 2$. Since $L_{G_1} \subset L_{G_2}$, we have a degree 2 extension of $\mathbb{C}$. By
the second bullet above, this is impossible, since every quadratic polynomial in $\mathbb{C}[x]$
splits completely over $\mathbb{C}$. Hence we must have $n = 1$, which implies that $|G| = 2$ and
$L = L_{G_1} \simeq \mathbb{C}$. It follows that $f$ splits completely over $\mathbb{C}$, as claimed. ∎


Notice that our proof of Theorem 8.5.9 translates the above two bullets into the 
following field-theoretic facts about R and C: 


• $\mathbb{R}$ has no extensions of odd degree $> 1$.
• $\mathbb{C}$ has no extensions of degree 2.


The essence of Artin’s argument is that these facts combine with Galois theory and 
results from group theory (the First Sylow Theorem and the solvability of groups of 
order 2”) to prove the Fundamental Theorem of Algebra. 


Historical Notes 


The universal polynomial $f$ considered in (8.23) is sometimes called the "general
polynomial" of degree $n$. We will see in Chapter 12 that Lagrange tried hard to solve
the general quintic in 1770. Using Lagrange's methods, Ruffini proved the impossibility
of solving the general quintic by radicals in 1799, though his proof was difficult to
follow (see [2]). In 1824, Abel, also using the methods of Lagrange, found a proof
that came to be more generally accepted. The general quintic is discussed in [3], [8],
and [11]. Also, one can prove the unsolvability of the general equation of degree
$n \ge 5$ by radicals (Theorem 8.5.6) without using Galois theory (see [1]). See [9] for
an account of Abel's proof and [10] for more on his life.

A tantalizing comment in [Galois, p. 72] suggests that Galois may have known the
First Sylow Theorem. We don't know whether he had a proof or simply conjectured
the result.

However, there is no doubt that Galois knew a lot about solvability by radicals.
Chapter 14 will explore Galois's amazing insights about when an irreducible
polynomial of degree $p$ or $p^2$, $p$ prime, is solvable by radicals.


Exercises for Section 8.5 


Exercise 1. Let $F \subset L_1$ and $F \subset L_2$ be splitting fields of $f \in F[x]$. Prove that $F \subset L_1$ is
solvable if and only if $F \subset L_2$ is solvable.


Exercise 2. Let $f \in F[x]$ be separable and irreducible, and assume that we have an extension
$F \subset F(\alpha)$ where $\alpha$ is a root of $f$. Prove that the Galois closure of this extension (as defined
in Section 7.1) is the splitting field of $f$ over $F$.


Exercise 3. Let $F$ have characteristic 0 and suppose that $f \in F[x]$ has degree $\le 4$ and is not
separable. Prove that $f$ is solvable by radicals over $F$.


Exercise 4. Let f be the minimal polynomial of ¥/~/17+ 37 over Q, where all of the 
indicated radicals are real. Prove that f is solvable by radicals over Q. 


Exercise 5. Let $F$ have characteristic 0, and assume that we have fields $F \subset K \subset L$. Also
suppose that $\alpha \in L$ is expressible by radicals over $K$ and that the extension $F \subset K$ is a solvable
extension. Prove carefully that the minimal polynomial of $\alpha$ over $F$ is solvable by radicals
over $F$.


Exercise 6. The proof of Theorem 8.5.9 used the Theorem of the Primitive Element to show 
that R has no extensions of odd degree > 1. Prove this without using primitive elements. 


8.6 THE CASUS IRREDUCIBILIS (OPTIONAL) 


In this optional section we will complete our discussion of the casus irreducibilis 
begun in Chapter 1. We will also give an example to show how solvability by radicals 
can fail in characteristic p. 


A. Real Radicals. By Section 1.3, a monic separable cubic polynomial $f \in \mathbb{R}[x]$
with real roots has discriminant $\Delta(f) > 0$. Then Cardan's formulas (8.15) imply that
the complex number

$\sqrt{\dfrac{-\Delta(f)}{27}} = i\,\sqrt{\dfrac{\Delta(f)}{27}}$

appears in the formulas for the roots of $f$, even though the roots are real.

It is natural to ask whether it is possible to express the roots of $f$ in terms of real
radicals in this situation. In some cases, such as $f = x^3 + x^2 - 5x - 5$, the answer is
yes, since $f = (x + 1)(x^2 - 5)$ has roots $-1, \pm\sqrt{5}$, which are expressible using real
radicals. However, we will show below that the answer is no whenever the cubic $f$
is irreducible. This is the casus irreducibilis from Section 1.3.
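A numerical experiment makes the phenomenon concrete. The sketch below applies Cardano's method in its standard textbook form for a depressed cubic $y^3 + py + q$, which may be normalized differently from (8.15). The sample cubic $y^3 - 3y + 1$ is an assumed illustration: it is irreducible over $\mathbb{Q}$ (neither $1$ nor $-1$ is a root) and has discriminant $-4p^3 - 27q^2 = 81 > 0$, so it has three real roots.

```python
import cmath

def cardano_roots(p, q):
    """Roots of y^3 + p*y + q by Cardano's method, using complex arithmetic.

    Returns the square root that enters the formula together with the roots.
    Assumes the chosen cube root u is nonzero (true for the sample below).
    """
    inner = cmath.sqrt(q * q / 4 + p ** 3 / 27)   # purely imaginary when the discriminant is > 0
    u = (-q / 2 + inner) ** (1 / 3)               # one complex cube root
    omega = cmath.exp(2j * cmath.pi / 3)          # primitive cube root of unity
    roots = []
    for k in range(3):
        uk = u * omega ** k
        vk = -p / (3 * uk)                        # enforce u*v = -p/3
        roots.append(uk + vk)
    return inner, roots

inner, roots = cardano_roots(-3, 1)
print(inner)
for r in roots:
    print(r, abs(r ** 3 - 3 * r + 1))
```

The square root inside the formula is a nonzero imaginary number, yet each computed root has imaginary part zero up to rounding error: the complex quantities cancel only after the cube roots are combined.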

We first give a careful definition of what it means for a real number to be expressible 
by real radicals. 


Definition 8.6.1 Let $F$ be a subfield of $\mathbb{R}$. Then:
(a) $F \subset L$ is a real radical extension if $F \subset L$ is radical and $L \subset \mathbb{R}$.


(b) $\alpha \in \mathbb{R}$ is expressible by real radicals over $F$ if there is a real radical extension
$F \subset L$ such that $\alpha \in L$.


Before proving our main theorem, we need to study radical extensions. Our first 
result allows us to limit ourselves to prime radicals. 


Lemma 8.6.2 If $F \subset K$ is a radical extension, then there are fields

$F = F_0 \subset F_1 \subset \cdots \subset F_{n-1} \subset F_n = K$

where for $i = 1,\dots,n$, there is $\gamma_i \in F_i$ such that $F_i = F_{i-1}(\gamma_i)$ and $\gamma_i^{m_i} \in F_{i-1}$ for some
prime $m_i$.


Proof: We first show that the lemma is true for an extension $F \subset F(\gamma)$ with $\gamma^m \in F$
for some $m > 1$. If $m$ is prime, then we are done, and if $m$ is not prime, then let $p$ be
a prime dividing $m$ and set $\delta = \gamma^p$. This gives extensions

$F \subset F(\delta) \subset F(\delta)(\gamma) = F(\gamma)$

such that $\gamma^p = \delta \in F(\delta)$ and $\delta^{m/p} = \gamma^m \in F$. If $m/p$ is prime, then we are done, and
if not, pick a prime dividing $m/p$ and continue as above. Thus the lemma holds for
$F \subset F(\gamma)$. Since any radical extension is a sequence of such extensions, the lemma
follows. ∎
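The bookkeeping in this proof can be sketched in a few lines. Given the exponent $m$ of a single radical $\gamma$ with $\gamma^m \in F$, the hypothetical function `prime_tower` below lists, from the bottom of the tower up, which power of $\gamma$ is adjoined at each step and the prime exponent that places its power in the previous field. It always picks the smallest prime factor first, which is one arbitrary choice among several valid ones.

```python
def prime_tower(m):
    """Split F in F(gamma), gamma**m in F, into prime-radical steps.

    Returns [(e, p), ...] from the bottom up: each step adjoins gamma**e,
    and (gamma**e)**p lies in the field produced by the previous step.
    """
    steps = []
    e = 1                     # the element currently on top is gamma**e
    while m > 1:
        p = next(q for q in range(2, m + 1) if m % q == 0)  # smallest prime factor of m
        steps.append((e, p))  # gamma**e is a p-th root of delta = gamma**(e*p)
        e *= p                # continue with delta, whose exponent is m/p
        m //= p
    return steps[::-1]        # the proof builds top-down; report bottom-up

print(prime_tower(12))
print(prime_tower(7))
```

For $m = 12$ this gives the tower $F \subset F(\gamma^4) \subset F(\gamma^2) \subset F(\gamma)$ with prime exponents 3, 2, 2, whose product is 12 as expected.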


We next study extensions obtained by adjoining real prime radicals. 


Lemma 8.6.3 Let $E$ be a subfield of $\mathbb{R}$, and suppose that $\gamma \in \mathbb{R}$ satisfies $\gamma \notin E$ and
$\gamma^m \in E$, where $m$ is prime. Then $g = x^m - \gamma^m$ is irreducible over $E$, and $[E(\gamma) : E] = m$.


Proof: By Proposition 4.2.6, it suffices to show that $g$ has no roots in $E$. If $\beta \in E$
is a root of $g$, then $\beta^m = \gamma^m$, so that $\beta = \zeta\gamma$ for some $m$th root of unity $\zeta$. It follows
that $\zeta = \pm 1$ since $\beta$ and $\gamma$ are real and nonzero and the only real roots of unity are
$\pm 1$. Thus $\gamma = \pm\beta \in E$. This contradiction proves the lemma. ∎


The following result will be a key tool in our analysis of the casus irreducibilis. 


Proposition 8.6.4 Suppose that $M \subset L$ is a Galois extension with $L \subset \mathbb{R}$ and $[L : M] = p$
for an odd prime $p$. Then $L$ cannot lie in a real radical extension of $M$.

Proof: Suppose that we have an extension

$M \subset M(\gamma)$

where $\gamma \notin M$, $\gamma \in \mathbb{R}$, and $\gamma^m \in M$ for some prime $m$. Then $M \subset \mathbb{R}$ implies that
$[M(\gamma) : M] = m$ by Lemma 8.6.3. We will relate $[L(\gamma) : M(\gamma)]$ to $[L : M]$ by considering
the following diagram:

(8.24)
          $L(\gamma)$
         /        \
      $L$        $M(\gamma)$
         \        /
           $M$

If $\gamma \in L$, then $M(\gamma) = L$, since $\gamma \notin M$ and $[L : M]$ is prime. It follows that

(8.25) $m = [M(\gamma) : M] = [L : M] = p,$

so that $m$ is odd. Furthermore, Lemma 8.6.3 implies that $x^m - \gamma^m$ is the minimal
polynomial of $\gamma$ over $M$. Since $M \subset L$ is normal, $x^m - \gamma^m$ splits completely over $L$.
The roots of this polynomial are $\zeta_m^{\ell}\gamma$ for $\ell = 0,\dots,m-1$, where $\zeta_m = e^{2\pi i/m}$. Since
$\gamma \ne 0$, it follows that $\zeta_m \in L$. This is impossible, since $m$ is odd and $L \subset \mathbb{R}$.

Hence $\gamma \notin L$, so that $[L(\gamma) : L] = m$ by Lemma 8.6.3. Using $m = [M(\gamma) : M]$ and
the Tower Theorem, (8.24) easily implies that

(8.26) $[L(\gamma) : M(\gamma)] = [L : M] = p.$

Thus adjoining a real prime radical doesn't change the degree.

By Lemma 8.6.2, a real radical extension $M \subset K$ is obtained by adjoining
successive real prime radicals. Each time we do this, (8.26) shows that the degree is
unchanged. In Exercise 1 you will use this to prove that

(8.27) $[KL : K] = [L : M] = p,$

where $KL$ is the compositum of $K$ and $L$. It follows that $KL \ne K$, which in turn implies
that $L \not\subset K$ (see Exercise 1). Since $M \subset K$ is an arbitrary real radical extension of $M$,
we conclude that $L$ cannot lie in such an extension, as claimed. ∎


B. Irreducible Polynomials with Real Radical Roots. We can now state
a generalized version of the casus irreducibilis proved by Hölder in 1891 [6] and
independently by Isaacs in 1985 [7]. Their result shows that when an irreducible
polynomial has all real roots, the roots are expressible by real radicals only in very
special cases.


Theorem 8.6.5 Let $F$ be a subfield of $\mathbb{R}$ and let $f \in F[x]$ be irreducible with splitting
field $F \subset L \subset \mathbb{R}$. Then the following conditions are equivalent:
(a) Some root of $f$ is expressible by real radicals over $F$.
(b) All roots of $f$ are expressible by real radicals over $F$ in which only square roots
appear.
(c) $F \subset L$ is a radical extension.
(d) $[L : F]$ is a power of 2.


Proof: Some implications of the proof are easy. For example, (b) $\Rightarrow$ (a) is trivial
and (c) $\Rightarrow$ (a) follows from $L \subset \mathbb{R}$.

Now suppose that (d) holds. This implies that $|\mathrm{Gal}(L/F)|$ is a power of 2. As in
the proof of Theorem 8.5.9, this leads to subfields

$F = L_{G_0} \subset L_{G_1} \subset \cdots \subset L_{G_{n-1}} \subset L_{G_n} = L$

where each field has degree 2 over the previous. Since the characteristic is different
from 2, each field is obtained from the previous by adjoining a square root. This
shows that $F \subset L$ is radical, so that (d) $\Rightarrow$ (c) follows. We also obtain (d) $\Rightarrow$ (b) since
$L \subset \mathbb{R}$.

It remains to prove (a) $\Rightarrow$ (d). We have $f \in F[x]$ with splitting field $F \subset L \subset \mathbb{R}$.
Now assume that some root $\alpha$ of $f$ lies in a real radical extension $F \subset K$ and that
$[L : F]$ is not a power of 2. Our goal is to derive a contradiction.

We will use a clever idea from [7] to reduce to the situation of Proposition 8.6.4.
Let $p$ be an odd prime dividing $[L : F]$. We claim the following:

(8.28) There is $\sigma \in \mathrm{Gal}(L/F)$ of order $p$ such that $\sigma(\alpha) \ne \alpha$.


Let us first use (8.28) to get our desired contradiction. Given $\sigma$ as in (8.28), let
$M = L_{\langle\sigma\rangle}$ be the fixed field of the cyclic group generated by $\sigma$. Then the Galois
correspondence implies that $M \subset L$ is a Galois extension such that

$[L : M] = |\mathrm{Gal}(L/M)| = |\mathrm{Gal}(L/L_{\langle\sigma\rangle})| = |\langle\sigma\rangle| = p.$

By Proposition 8.6.4, $L$ lies in no real radical extension of $M$.

On the other hand, $\alpha \in L$ because $L$ is the splitting field of $f$, yet $\alpha \notin M$ because
$\sigma(\alpha) \ne \alpha$. Since $[L : M]$ is prime, $L = M(\alpha)$ follows. Furthermore, we are assuming
that $\alpha \in K$, where $F \subset K$ is real radical. Hence:

• $L = M(\alpha) \subset MK$.
• $M \subset MK$ is real radical, since $F \subset K$ is (see Exercise 2).

These bullets imply that $L$ lies in a real radical extension of $M$, which contradicts the
previous paragraph.

It remains to prove (8.28). Since $p$ divides $[L : F] = |\mathrm{Gal}(L/F)|$, Cauchy's Theorem
implies that $\mathrm{Gal}(L/F)$ has an element $\tau$ of order $p$. Let the roots of $f$ be
$\alpha_1 = \alpha, \dots, \alpha_m$, $m = \deg(f)$. Using $L = F(\alpha_1,\dots,\alpha_m)$ and $\tau \ne 1_L$, we see that
$\tau(\alpha_i) \ne \alpha_i$ for some $i$. However, $f$ is irreducible, so that by Proposition 5.1.8, there
is $\sigma_i \in \mathrm{Gal}(L/F)$ such that $\sigma_i(\alpha) = \alpha_i$. Then $\sigma_i^{-1}\tau\sigma_i$ has order $p$ and

$\sigma_i^{-1}\tau\sigma_i(\alpha) = \sigma_i^{-1}\tau(\alpha_i) \ne \sigma_i^{-1}(\alpha_i) = \alpha.$

It follows easily that $\sigma = \sigma_i^{-1}\tau\sigma_i$ satisfies the conditions of (8.28). The proof of
the theorem is complete. ∎


Theorem 8.6.5 has the following useful corollary. 


Corollary 8.6.6 Let $F$ be a subfield of $\mathbb{R}$, and assume that $f \in F[x]$ is irreducible
and $\deg(f)$ is not a power of 2. If $f$ splits completely over $\mathbb{R}$, then no root of $f$ is
expressible by real radicals over $F$.


Proof: Let $F \subset L \subset \mathbb{R}$ be the splitting field of $f$ over $F$, and let $\alpha \in L$ be a root of
$f$. Then the extensions

$F \subset F(\alpha) \subset L$

and the Tower Theorem imply that $[L : F]$ is a multiple of $\deg(f)$. Since $\deg(f)$ is
not a power of 2, the same is true for $[L : F]$. Then the corollary follows from the
equivalence (a) $\iff$ (d) of Theorem 8.6.5. ∎


In concrete terms, Corollary 8.6.6 says that if a polynomial with real roots is 
irreducible over a subfield F C R and has degree not a power of 2, then it is impossible 
to express any of its roots using real radicals over F. In particular, this is true for any 
irreducible cubic with real roots, which is the casus irreducibilis. 

Here is an example of Theorem 8.6.5. 


Example 8.6.7 Consider the polynomial
\[ f = x^4 - 4x^2 + x + 1. \]
By Exercise 3, $f$ is irreducible over $\mathbb{Q}$ and all of its roots are real. In Chapter 13 we
will show that the Galois group of $f$ over $\mathbb{Q}$ is isomorphic to $S_4$. If $L$ is the splitting
field of $f$ over $\mathbb{Q}$, it follows that $[L:\mathbb{Q}] = 24$. This is not a power of 2, so that by
Theorem 8.6.5, no root of $f$ is expressible in terms of real radicals. Yet $f$ is solvable
by radicals since $S_4$ is solvable. ◁
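The claim that all four roots of $f$ are real can be checked by hand with the Intermediate Value Theorem. The following Python sketch does the arithmetic; the sample points are our own choice, not taken from the text. It does not prove irreducibility: the absence of rational roots only rules out linear factors.

```python
def f(x):
    # f = x^4 - 4x^2 + x + 1 from Example 8.6.7
    return x**4 - 4 * x**2 + x + 1

# Hand-picked sample points (an assumption of this sketch). Consecutive
# values of opposite sign bracket a real root.
points = [-3, -2, 0, 1, 2]
values = [f(x) for x in points]
sign_changes = sum(1 for a, b in zip(values, values[1:]) if a * b < 0)

# Rational Root Theorem: since f is monic with constant term 1,
# the only candidates for rational roots are +1 and -1.
rational_roots = [r for r in (1, -1) if f(r) == 0]

print(values)          # [43, -1, 1, -1, 3]
print(sign_changes)    # 4
print(rational_roots)  # []
```

Four sign changes give four distinct real roots, which is all a quartic can have.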


We can use Corollary 8.6.6 to construct solvable extensions that are not radical. 


Example 8.6.8 Consider $f = x^3 + x^2 - 2x - 1$. In Example 8.2.3, we showed that
the splitting field $\mathbb{Q} \subset L$ of $f$ is solvable but not radical. This follows immediately
from Corollary 8.6.6 since $f$ is irreducible of degree 3 with all real roots. ◁


We will also see that Theorem 8.6.5 has applications to the geometric constructions 
considered in the Mathematical Notes to Section 10.1. 


C. The Failure of Solvability in Characteristic p. One surprise is that the 
methods used to study real radicals can also shed light on solvability by radicals for 
fields of characteristic p. When we considered solvable extensions in Sections 8.3 
and 8.5, we explicitly assumed that we were working in characteristic 0. As we will 
see, this is necessary because of the lack of pth roots of unity in characteristic p. 

We begin with a result on adjoining prime radicals. We say that a field $F$ contains
all roots of unity if $x^m - 1$ splits completely over $F$ for all integers $m \ge 1$.

Lemma 8.6.9 Let $F$ be a field that contains all roots of unity. Also assume that $F$
has an extension containing an element $\gamma$ such that $\gamma \notin F$ but $\gamma^m \in F$ for some prime
$m$. Then $g = x^m - \gamma^m$ is irreducible over $F$ and $[F(\gamma):F] = m$.


Proof: If $g$ has a root $\beta \in F$, then $\beta^m = \gamma^m$, which implies that $\beta = \zeta\gamma$ for some
$m$th root of unity $\zeta$. Then $\zeta, \beta \in F$ imply that $\gamma \in F$, which is a contradiction. Hence
$g$ has no roots in $F$ and the lemma follows from Proposition 4.2.6. ∎


We now state our main result. 


Proposition 8.6.10 Let $M$ be a field of characteristic $p$ that contains all roots of
unity. Then any Galois extension $M \subset L$ of degree $p$ is not solvable.


Proof: We need to prove that $L$ cannot lie in a radical extension of $M$. The argument
is remarkably similar to the proof of Proposition 8.6.4. Consider an extension
\[ M \subset M(\gamma) \]
such that $\gamma \notin M$ and $\gamma^m \in M$ for some prime $m$. Lemma 8.6.9 implies that $g = x^m - \gamma^m$
is irreducible over $M$ and $[M(\gamma):M] = m$.

Following the proof of Proposition 8.6.4, we have the diagram (8.24). If $\gamma \in L$,
then $M(\gamma) = L$ since $[L:M]$ is prime. Using (8.25), we see that $m = p$. Then the
minimal polynomial of $\gamma$ over $M$ is $g = x^p - \gamma^p$. However, $M \subset L$ is Galois and
hence separable, so that $\gamma$ would be separable over $M$. Yet $g = x^p - \gamma^p = (x - \gamma)^p$ is
clearly inseparable.

This contradiction shows that $\gamma \notin L$, so that $[L(\gamma):L] = m$ by Lemma 8.6.9. Since
$[M(\gamma):M] = m$, the Tower Theorem and (8.24) imply that
\[ [L(\gamma):M(\gamma)] = [L:M] = p. \]
Thus adjoining a prime radical doesn't change the degree. Also, the extension
$M(\gamma) \subset L(\gamma)$ is Galois (you will prove this in Exercise 4), and $M(\gamma)$ contains all
roots of unity because $M$ does.

From here, it is straightforward to show that $L$ lies in no radical extension of $M$.
We leave the details as Exercise 4. ∎


We construct an extension $M \subset L$ that satisfies Proposition 8.6.10 as follows.


Example 8.6.11 Let $k$ be an algebraically closed field of characteristic $p$ (the
existence of such a field is proved in [Jacobson, Vol. II, Sec. 8.1]), and let $M = k(t)$,
where $t$ is a variable. Since $x^m - 1 \in k[x]$ splits completely over $k$ for any $m$, it follows
that $M$ contains all roots of unity.

Following Artin and Schreier, we consider the polynomial
\[ f = x^p - x + t \in M[x]. \]
Let $L$ be a splitting field of $f$ over $M$. We know that $M \subset L$ is a Galois extension by
Exercise 15 of Section 5.3. In Exercise 5 you will show that there is a one-to-one
group homomorphism
\[ \mathrm{Gal}(L/M) \longrightarrow \mathbb{Z}/p\mathbb{Z}. \tag{8.29} \]
Since $[L:M] = |\mathrm{Gal}(L/M)|$, it follows that $[L:M] = 1$ or $p$. The former would imply
that $L = M$, which would mean that $f$ splits completely over $M$. However, Exercise 6
will show that $f$ has no roots in $M$. Hence $[L:M] = p$.

It follows that $M \subset L$ is a Galois extension of degree $p$. Since $M$ contains all roots
of unity, Proposition 8.6.10 implies that $M \subset L$ is not solvable. ◁
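The homomorphism (8.29) rests on the fact that replacing $x$ by $x + i$ does not change $x^p - x + t$ in characteristic $p$. The following Python sketch checks this at the level of integer coefficients reduced mod $p$. It illustrates the identity only and is not a solution to Exercise 5; the function name is ours.

```python
from math import comb

def shift_difference(n, i):
    """Coefficients mod n of (x+i)^n - (x+i) - (x^n - x), lowest degree first.

    The term +t of the Artin-Schreier polynomial cancels, so it is omitted.
    """
    coeffs = [comb(n, k) * pow(i, n - k) for k in range(n + 1)]  # (x + i)^n
    coeffs[0] -= i   # subtract the constant term of (x + i)
    coeffs[1] -= 1   # subtract the x term of (x + i)
    coeffs[n] -= 1   # subtract x^n
    coeffs[1] += 1   # add x back, since we subtract (x^n - x)
    return [c % n for c in coeffs]

# For a prime p every coefficient vanishes mod p, for every shift i.
for p in (2, 3, 5, 7, 11):
    assert all(shift_difference(p, i) == [0] * (p + 1) for i in range(p))

# For a composite exponent the identity fails.
print(shift_difference(4, 1))  # [0, 0, 2, 0, 0]
```

So if $\alpha$ is a root of $f$, then so is $\alpha + i$ for every $i$ in the prime field.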


In Example 8.6.11, note that $\mathrm{Gal}(L/M) \simeq \mathbb{Z}/p\mathbb{Z}$ since $M \subset L$ is a Galois extension
of degree $p$. So the Galois group is Abelian and hence solvable, yet the extension is
not solvable. This shows that the relation between solvable extensions and solvable
Galois groups breaks down in characteristic $p$.

To see where the problem is, note that a key step in the proof of Proposition 8.6.10
is the observation that in characteristic $p$, the polynomial $g = x^p - \gamma^p = (x - \gamma)^p$ is
not separable and hence $\gamma$ can't lie in a nontrivial Galois extension. The inseparability
of $g$ can be explained by the small number of $p$th roots of unity in characteristic $p$,
for any two roots of $g = x^p - \gamma^p$ differ by a $p$th root of unity, but $x^p - 1 = (x - 1)^p$
implies that the only $p$th root of unity is 1.
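The identity $x^p - 1 = (x-1)^p$ can be made concrete in the prime field $\mathbb{F}_p$. The sketch below, with helper names of our own, expands $(x-1)^p$ using binomial coefficients and also finds the $p$th roots of unity in $\mathbb{F}_p$ by brute force.

```python
from math import comb

def x_minus_one_to_the(p):
    """Coefficients of (x - 1)^p reduced mod p, lowest degree first."""
    return [(comb(p, k) * (-1) ** (p - k)) % p for k in range(p + 1)]

def pth_roots_of_unity_mod(p):
    """All a in F_p with a^p = 1, found by brute force."""
    return [a for a in range(p) if pow(a, p, p) == 1]

for p in (2, 3, 5, 7, 11):
    # x^p - 1 has coefficient list [-1, 0, ..., 0, 1], and -1 = p - 1 mod p.
    assert x_minus_one_to_the(p) == [p - 1] + [0] * (p - 1) + [1]
    assert pth_roots_of_unity_mod(p) == [1]
```

The middle binomial coefficients are all divisible by $p$, which is exactly why the expansion collapses.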

It turns out that if one avoids extensions whose degree is divisible by p, then the 
relation between solvable extensions and solvable Galois groups works out nicely in 
characteristic p. See Exercise 7 for a proof. 


Historical Notes 


Although the casus irreducibilis for the cubic equation dates back to the sixteenth 
century, rigorous proofs didn’t appear until late in the nineteenth century. Mollame 
gave a proof in 1890, followed a year later by the more general result of Hölder
proved in the text. A quick proof for cubics can be found in [van der Waerden]. 

We should also mention the following result of Loewy from the 1920s. 


Theorem 8.6.12 If $F \subset \mathbb{R}$ and $f \in F[x]$ is irreducible of degree $2^m n$, $n$ odd, then $f$
has at most $2^m$ roots expressible by real radicals over $F$. ∎


When f is irreducible of odd degree, Theorem 8.6.12 implies that at most one 
root can be expressible by real radicals. For cubics, this is consistent with Cardan’s 
formulas (see Example 1.1.1). References and a proof of Theorem 8.6.12 when the 
degree is odd can be found in [Chebotarev, p. 350]. 


Exercises for Section 8.6 


Exercise 1. Here are some details from the proof of Proposition 8.6.4.
(a) Prove (8.27).
(b) Prove that $KL = K$ if and only if $L \subset K$.


Exercise 2. Let $F \subset K$ be a real radical extension and suppose that $F \subset M \subset \mathbb{R}$. Prove that $M \subset MK$
is a real radical extension.


Exercise 3. Show that the polynomial $f = x^4 - 4x^2 + x + 1$ of Example 8.6.7 is irreducible
over $\mathbb{Q}$ and has four real roots.

Exercise 4. Complete the proof of Proposition 8.6.10.


Exercise 5. This exercise will consider the polynomial $f = x^p - x + t$ from Example 8.6.11.
Let $\alpha \in L$ be a root of $f$.
(a) Show that the roots of $f$ are $\alpha, \alpha + 1, \dots, \alpha + p - 1$.
(b) Let $\sigma \in \mathrm{Gal}(L/M)$. By part (a), $\sigma(\alpha) = \alpha + i$ for some $i$. Prove that $\sigma \mapsto [i]$ gives the
desired one-to-one homomorphism (8.29).


Exercise 6. Let $k$ be a field and let $M = k(t)$, where $t$ is a variable. The goal of this exercise
is to prove that if $n > 1$, then there is no element $\beta \in M$ such that $\beta^n - \beta + t = 0$.
(a) Write $\beta = A/B$, where $A, B \in k[t]$ are relatively prime polynomials. Prove that $\beta^n - \beta + t = 0$
implies that $B \mid A$ and hence that $B$ is constant.
(b) Show that $A^n - A + t \ne 0$ for all polynomials $A \in k[t]$.


Exercise 7. Suppose that $F$ is a field of characteristic $p$ and that $F \subset L$ is a Galois extension.
Also assume that $\mathrm{Gal}(L/F)$ is solvable and that $p \nmid [L:F]$. Prove that $F \subset L$ is solvable.

CHAPTER 9


CYCLOTOMIC EXTENSIONS 


In this chapter we will explore the Galois theory of cyclotomic extensions, which are
extensions of the form $\mathbb{Q} \subset \mathbb{Q}(\zeta_n)$, $\zeta_n = e^{2\pi i/n}$. This will involve a study of cyclotomic
polynomials and Gauss's theory of periods. In the next chapter we will apply these
results to determine which regular polygons are constructible by straightedge and
compass.


9.1 CYCLOTOMIC POLYNOMIALS 
In Section 4.2 we showed that if $p$ is prime, then
\[ \Phi_p(x) = x^{p-1} + x^{p-2} + \cdots + x + 1 \]
is the minimal polynomial of $\zeta_p = e^{2\pi i/p}$ over $\mathbb{Q}$. In this section, we will describe the
minimal polynomial of
\[ \zeta_n = e^{2\pi i/n} \]
over $\mathbb{Q}$, where $n$ is now an arbitrary integer $\ge 1$. We will also compute the Galois
group $\mathrm{Gal}(\mathbb{Q}(\zeta_n)/\mathbb{Q})$. But first, we need two facts from elementary number theory.

A. Some Number Theory. We begin with the Euler $\phi$-function. Given a positive
integer $n$, we define $\phi(n)$ to be the number of integers $i$ such that $0 \le i < n$ and
$\gcd(i,n) = 1$. We can interpret $\phi(n)$ in terms of the ring $\mathbb{Z}/n\mathbb{Z}$ as follows. The
invertible elements of this ring form the set
\[ (\mathbb{Z}/n\mathbb{Z})^* = \{[i] \in \mathbb{Z}/n\mathbb{Z} \mid [i][j] = [1] \text{ for some } [j] \in \mathbb{Z}/n\mathbb{Z}\}. \]
One easily sees that $(\mathbb{Z}/n\mathbb{Z})^*$ is a group under multiplication. In Exercise 1 you will
show that $(\mathbb{Z}/n\mathbb{Z})^*$ has order $\phi(n)$. In other words,
\[ \phi(n) = |(\mathbb{Z}/n\mathbb{Z})^*|. \tag{9.1} \]
Our first lemma gives the basic properties of the $\phi$-function.


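Lemma 9.1.1 Let $\phi$ be defined as above.
(a) If $n$ and $m$ are relatively prime positive integers, then $\phi(nm) = \phi(n)\phi(m)$.
(b) If $n > 1$ is an integer, then
\[ \phi(n) = n \prod_{p \mid n} \Bigl(1 - \frac{1}{p}\Bigr), \]
where the product is over all primes $p$ dividing $n$.


Proof: Since $\gcd(n,m) = 1$, Lemma A.5.2 implies that there is a ring isomorphism
$\alpha : \mathbb{Z}/nm\mathbb{Z} \simeq \mathbb{Z}/n\mathbb{Z} \times \mathbb{Z}/m\mathbb{Z}$. In Exercise 2 you will show that $\alpha$ induces a group
isomorphism
\[ (\mathbb{Z}/nm\mathbb{Z})^* \simeq (\mathbb{Z}/n\mathbb{Z})^* \times (\mathbb{Z}/m\mathbb{Z})^*. \]
Then $\phi(nm) = \phi(n)\phi(m)$ follows immediately from (9.1).

Next observe that if $p$ is prime and $a \ge 1$, then $\phi(p^a)$ counts the number of integers
$i$ such that $0 \le i < p^a$ and $p \nmid i$. In other words, if
\[ S = \{ j \in \mathbb{Z} \mid 0 \le j < p^a \text{ and } p \mid j \}, \]
then $\phi(p^a) = p^a - |S|$. However, $p \mid j$ for some $0 \le j < p^a$ if and only if $j = p\ell$ for
some $0 \le \ell < p^{a-1}$. Thus $|S| = p^{a-1}$, so that $\phi(p^a) = p^a - p^{a-1}$.

For arbitrary $n > 1$, write $n = p_1^{a_1} \cdots p_s^{a_s}$, where the $p_i$ are distinct primes and $a_i \ge 1$
for all $i$. Using part (a) and the formula $\phi(p^a) = p^a - p^{a-1}$, we obtain
\begin{align*}
\phi(n) = \phi(p_1^{a_1} \cdots p_s^{a_s}) &= \phi(p_1^{a_1}) \cdots \phi(p_s^{a_s}) = (p_1^{a_1} - p_1^{a_1-1}) \cdots (p_s^{a_s} - p_s^{a_s-1}) \\
&= p_1^{a_1}\Bigl(1 - \frac{1}{p_1}\Bigr) \cdots p_s^{a_s}\Bigl(1 - \frac{1}{p_s}\Bigr) = n \prod_{p \mid n} \Bigl(1 - \frac{1}{p}\Bigr).
\end{align*}
This completes the proof. ∎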
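Both descriptions of $\phi(n)$ are easy to compare by machine. The following Python sketch (function names are ours) computes $\phi$ once by the definition and once by the product formula of Lemma 9.1.1(b), using exact integer arithmetic.

```python
from math import gcd

def phi_count(n):
    """Euler phi by the definition: count 0 <= i < n with gcd(i, n) = 1."""
    return sum(1 for i in range(n) if gcd(i, n) == 1)

def prime_divisors(n):
    """Distinct primes dividing n, by trial division."""
    primes, p = [], 2
    while p * p <= n:
        if n % p == 0:
            primes.append(p)
            while n % p == 0:
                n //= p
        p += 1
    if n > 1:
        primes.append(n)
    return primes

def phi_product(n):
    """Euler phi by Lemma 9.1.1(b): n * prod(1 - 1/p), kept in integers."""
    result = n
    for p in prime_divisors(n):
        result = result // p * (p - 1)
    return result

assert all(phi_count(n) == phi_product(n) for n in range(1, 300))
print(phi_count(12), phi_count(105))               # 4 48
print(phi_count(72), phi_count(8) * phi_count(9))  # 24 24
```

The last line illustrates part (a) with the relatively prime pair 8 and 9.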


Our second lemma is sometimes called Fermat's Little Theorem.

Lemma 9.1.2 If $p$ is prime, then $a^p \equiv a \bmod p$ for all integers $a$.


Proof: Since the congruence is true when $p \mid a$, we may assume that $p \nmid a$. Then
$[a] \in (\mathbb{Z}/p\mathbb{Z})^*$, so that $[a]^{p-1} = [1]$, since $(\mathbb{Z}/p\mathbb{Z})^*$ is a group of order $p - 1$ under
multiplication. In congruence notation, this means that $a^{p-1} \equiv 1 \bmod p$. The desired
congruence follows by multiplying each side by $a$. ∎
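A quick numerical check of Lemma 9.1.2, written as a Python sketch with a helper name of our own. It also shows that the congruence is not a primality test: it fails for 4 but holds for the composite number $561 = 3 \cdot 11 \cdot 17$.

```python
def fermat_congruence_holds(n):
    """Check a^n = a mod n for a range of integers a, including negative ones."""
    return all(pow(a, n, n) == a % n for a in range(-n, 2 * n))

print([p for p in range(2, 30) if fermat_congruence_holds(p)])
# [2, 3, 5, 7, 11, 13, 17, 19, 23, 29]
print(fermat_congruence_holds(4))    # False, since 2^4 = 16 = 0 mod 4
print(fermat_congruence_holds(561))  # True, although 561 is composite
```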


B. Definition of Cyclotomic Polynomials. Our next task is to define the
cyclotomic polynomial $\Phi_n(x)$ for $n \ge 1$ and show that it has integer coefficients. We
begin with the factorization
\[ x^n - 1 = \prod_{0 \le i < n} (x - \zeta_n^i). \tag{9.2} \]
Then define the $n$th cyclotomic polynomial $\Phi_n(x)$ to be the product
\[ \Phi_n(x) = \prod_{\substack{0 \le i < n \\ \gcd(i,n) = 1}} (x - \zeta_n^i). \tag{9.3} \]
Thus the roots of $\Phi_n(x)$ are $\zeta_n^i$ for those $0 \le i < n$ relatively prime to $n$. It follows
that $\Phi_n(x)$ has degree $\phi(n)$. Combining this with (9.1), we see that
\[ \phi(n) = \deg(\Phi_n(x)) = |(\mathbb{Z}/n\mathbb{Z})^*|. \]
This link between $\Phi_n(x)$ and $(\mathbb{Z}/n\mathbb{Z})^*$ will be used to determine $\mathrm{Gal}(\mathbb{Q}(\zeta_n)/\mathbb{Q})$.

In Section 8.3 we defined a root of $x^n - 1$ to be a primitive $n$th root of unity if its
powers give all roots of $x^n - 1$. In Exercise 3 you will prove that in our situation, the
primitive $n$th roots of unity are $\zeta_n^i$ for $0 \le i < n$ and $\gcd(i,n) = 1$. Thus the roots of
$\Phi_n(x)$ are the primitive $n$th roots of unity in $\mathbb{C}$.

Here are some examples of cyclotomic polynomials. 


Example 9.1.3 When $n = 2$, the only primitive square root of unity is $-1$, so that
$\Phi_2(x) = x + 1$. When $n = 4$, the primitive fourth roots of unity are $i$ and $i^3 = -i$, so
that
\[ \Phi_4(x) = (x - i)(x + i) = x^2 + 1. \]
Since $\Phi_1(x) = x - 1$, we get the factorization
\[ x^4 - 1 = (x - 1)(x + 1)(x^2 + 1) = \Phi_1(x)\Phi_2(x)\Phi_4(x). \]
Proposition 9.1.5 will show that $x^n - 1$ has a similar factorization. ◁


Example 9.1.4 Let $p$ be prime. Since $1, \dots, p - 1$ are relatively prime to $p$, it follows
that
\[ \Phi_p(x) = (x - \zeta_p)(x - \zeta_p^2) \cdots (x - \zeta_p^{p-1}) = \frac{x^p - 1}{x - 1}. \]
Using $x^p - 1 = (x - 1)(x^{p-1} + \cdots + x + 1)$, we obtain $\Phi_p(x) = x^{p-1} + \cdots + x + 1$,
which agrees with the definition of $\Phi_p(x)$ given in Section 4.2. ◁




In the following discussion we will write d|n to indicate that d is a positive divisor 
of n. We now state some elementary properties of cyclotomic polynomials. 


Proposition 9.1.5 $\Phi_n(x)$ is a monic polynomial with integer coefficients and has
degree $\phi(n)$. Furthermore, these polynomials satisfy the identity
\[ x^n - 1 = \prod_{d \mid n} \Phi_d(x). \tag{9.4} \]


Proof: $\Phi_n(x)$ is monic by definition and has degree $\phi(n)$ as shown above. Next
we prove the factorization (9.4). The basic idea is that every number $i$ in the range
$0 \le i < n$ gives a divisor $d = \gcd(i,n)$ of $n$. Since different values of $i$ can give the
same $d$, we can organize the factorization (9.2) according to $d$. This gives
\[ x^n - 1 = \prod_{d \mid n} \ \prod_{\substack{0 \le i < n \\ \gcd(i,n) = d}} (x - \zeta_n^i). \]
For a fixed positive divisor $d$ of $n$, the corresponding part of this factorization is
\[ \prod_{\substack{0 \le i < n \\ \gcd(i,n) = d}} (x - \zeta_n^i). \tag{9.5} \]
But $\gcd(i,n) = d$ implies that $i = dj$ and $n = d\,\frac{n}{d}$, where $\gcd(j, \frac{n}{d}) = 1$. Also:

• $0 \le i < n$ becomes $0 \le dj < d\,\frac{n}{d}$, which is equivalent to $0 \le j < \frac{n}{d}$.
• $\zeta_n^d = \zeta_{n/d}$, so that $x - \zeta_n^i = x - \zeta_n^{dj} = x - (\zeta_{n/d})^j$.

It follows that (9.5) can be written
\[ \prod_{\substack{0 \le j < n/d \\ \gcd(j, n/d) = 1}} (x - \zeta_{n/d}^j), \]
which by (9.3) is the cyclotomic polynomial $\Phi_{n/d}(x)$. Thus the above factorization of
$x^n - 1$ becomes
\[ x^n - 1 = \prod_{d \mid n} \Phi_{n/d}(x). \]
Then (9.4) follows since $d$ is a positive divisor of $n$ if and only if $\frac{n}{d}$ is.

It remains to show that $\Phi_n(x)$ has integer coefficients. We prove this by complete
induction on $n$. The base case $n = 1$ is trivial, since $\Phi_1(x) = x - 1$. Furthermore, if
$n > 1$, then (9.4) and our inductive hypothesis imply that
\begin{align*}
x^n - 1 &= \Phi_n(x) \cdot \prod_{d \mid n,\ d < n} \Phi_d(x) \\
&= \Phi_n(x) \cdot \text{a monic polynomial } g(x) \text{ with integer coefficients}.
\end{align*}
Hence $\Phi_n(x)$ is the quotient of $x^n - 1$ by $g(x)$. Since $x^n - 1$ and $g(x)$ lie in $\mathbb{Z}[x]$
and $g(x)$ is monic, the refinement of the division algorithm presented in Exercise 4
implies that $\Phi_n(x) \in \mathbb{Z}[x]$. This completes the proof. ∎
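The inductive argument above is effectively an algorithm: divide $x^n - 1$ by $\Phi_d(x)$ for each proper divisor $d$ of $n$. Here is a minimal Python sketch of that idea. The representation (tuples of integer coefficients, lowest degree first) and the function names are our own choices.

```python
from functools import lru_cache

def divide_by_monic(num, den):
    """Exact quotient of integer polynomials, where den is monic.

    Because den is monic, only integer subtractions occur and no fractions
    appear, which is the point of the refined division algorithm.
    """
    num = list(num)
    quot = [0] * (len(num) - len(den) + 1)
    for k in range(len(quot) - 1, -1, -1):
        c = num[k + len(den) - 1]
        quot[k] = c
        for j, d in enumerate(den):
            num[k + j] -= c * d
    if any(num):
        raise ValueError("division was not exact")
    return quot

@lru_cache(maxsize=None)
def cyclotomic(n):
    """Phi_n via (9.4): divide x^n - 1 by Phi_d for all proper divisors d of n."""
    poly = [-1] + [0] * (n - 1) + [1]  # x^n - 1
    for d in range(1, n):
        if n % d == 0:
            poly = divide_by_monic(poly, cyclotomic(d))
    return tuple(poly)

print(cyclotomic(4))   # (1, 0, 1), i.e. x^2 + 1
print(cyclotomic(6))   # (1, -1, 1), i.e. x^2 - x + 1
print(cyclotomic(12))  # (1, 0, -1, 0, 1), i.e. x^4 - x^2 + 1
```

That every division is exact, so that `ValueError` is never raised, reflects the factorization (9.4).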


Here are some examples of how to use the identity (9.4). 
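Example 9.1.6 Let $p$ be prime. Proposition 9.1.5 implies that
\[ x^p - 1 = \Phi_1(x)\Phi_p(x) \quad \text{and} \quad x^{p^2} - 1 = \Phi_1(x)\Phi_p(x)\Phi_{p^2}(x). \]
Thus
\[ x^{p^2} - 1 = (x^p - 1)\Phi_{p^2}(x). \]
It follows that
\[ \Phi_{p^2}(x) = \frac{x^{p^2} - 1}{x^p - 1} = x^{p(p-1)} + x^{p(p-2)} + \cdots + x^p + 1, \]
where the second equality follows from
\[ \frac{x^p - 1}{x - 1} = x^{p-1} + x^{p-2} + \cdots + x + 1 \]
by replacing $x$ with $x^p$. ◁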


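Example 9.1.7 In the examples of cyclotomic polynomials given so far, the
coefficients are always 0 or $\pm 1$. This is true for all $n < 105$. You will show in Exercise 5
that $\Phi_{105}(x)$ is the polynomial
\begin{align*}
1 &+ x + x^2 - x^5 - x^6 - 2x^7 - x^8 - x^9 + x^{12} + x^{13} + x^{14} \\
&+ x^{15} + x^{16} + x^{17} - x^{20} - x^{22} - x^{24} - x^{26} - x^{28} + x^{31} + x^{32} + x^{33} + x^{34} \\
&+ x^{35} + x^{36} - x^{39} - x^{40} - 2x^{41} - x^{42} - x^{43} + x^{46} + x^{47} + x^{48}.
\end{align*}
As $n$ increases, the coefficients of $\Phi_n(x)$ can get arbitrarily large (see [1] and [4]). ◁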
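The statement about $n < 105$ can be confirmed by a direct computation based on (9.4). The Python sketch below is a machine check, not a substitute for Exercise 5; the function names and the table layout are ours.

```python
def divide_by_monic(num, den):
    """Exact quotient of integer polynomials (lowest degree first), den monic."""
    num = list(num)
    quot = [0] * (len(num) - len(den) + 1)
    for k in range(len(quot) - 1, -1, -1):
        c = num[k + len(den) - 1]
        quot[k] = c
        for j, d in enumerate(den):
            num[k + j] -= c * d
    assert not any(num), "division was not exact"
    return quot

def cyclotomic_table(limit):
    """Dictionary n -> coefficients of Phi_n for 1 <= n <= limit, using (9.4)."""
    table = {}
    for n in range(1, limit + 1):
        poly = [-1] + [0] * (n - 1) + [1]  # x^n - 1
        for d in range(1, n):
            if n % d == 0:
                poly = divide_by_monic(poly, table[d])
        table[n] = poly
    return table

table = cyclotomic_table(105)
# Smallest n for which some coefficient of Phi_n exceeds 1 in absolute value.
first_exception = min(n for n, poly in table.items() if max(map(abs, poly)) > 1)
phi105 = table[105]

print(first_exception)                        # 105
print(len(phi105) - 1)                        # 48
print(phi105[7], phi105[41])                  # -2 -2
```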


C. The Galois Group of a Cyclotomic Extension. The first step in computing
$\mathrm{Gal}(\mathbb{Q}(\zeta_n)/\mathbb{Q})$ is to prove that $\Phi_n(x)$ is irreducible. For this, we need the following
application of symmetric polynomials and Lemma 9.1.2.


Lemma 9.1.8 Let $f \in \mathbb{Z}[x]$ be monic of positive degree, and let $p$ be prime. If $f_p$ is
the monic polynomial whose roots are the $p$th powers of the roots of $f$, then:

(a) $f_p \in \mathbb{Z}[x]$.

(b) The coefficients of $f$ and $f_p$ are congruent modulo $p$.


Proof: If $f$ has roots $\gamma_1, \dots, \gamma_r$, $r = \deg(f)$, then
\[ f_p(x) = \prod_{i=1}^{r} (x - \gamma_i^p) = x^r - \sigma_1(\gamma_1^p, \dots, \gamma_r^p)x^{r-1} + \cdots + (-1)^r \sigma_r(\gamma_1^p, \dots, \gamma_r^p). \]
Similarly, $f(x) = x^r - \sigma_1(\gamma_1, \dots, \gamma_r)x^{r-1} + \cdots + (-1)^r \sigma_r(\gamma_1, \dots, \gamma_r)$. In these
formulas, $\sigma_1, \dots, \sigma_r$ are the elementary symmetric polynomials from Chapter 2.

Observe that $\sigma_i(x_1^p, \dots, x_r^p)$ is a symmetric polynomial. In Exercise 6 you will
show that the algorithm of Theorem 2.2.2 implies that
\[ \sigma_i(x_1^p, \dots, x_r^p) = \sigma_i^p + S(\sigma_1, \dots, \sigma_r), \tag{9.6} \]
where $S(\sigma_1, \dots, \sigma_r)$ is a polynomial in $\sigma_1, \dots, \sigma_r$ with integer coefficients. However,
if we reduce modulo $p$, then Lemma 5.3.10 implies that
\[ \sigma_i^p = \sigma_i(x_1, \dots, x_r)^p = \sigma_i(x_1^p, \dots, x_r^p) \]
as polynomials with coefficients in $\mathbb{F}_p$ (see Exercise 6 for details). Combining this
with (9.6), we see that the coefficients of $S(\sigma_1, \dots, \sigma_r)$ are all divisible by $p$.

Now substitute $\gamma_1, \dots, \gamma_r$ for $x_1, \dots, x_r$ in (9.6). Since $\sigma_i(\gamma_1, \dots, \gamma_r) \in \mathbb{Z}$ for all
$i$ and $S$ has integer coefficients, we conclude that $\sigma_i(\gamma_1^p, \dots, \gamma_r^p) \in \mathbb{Z}$. Since the
coefficients of $S$ are all divisible by $p$, we also have
\[ \sigma_i(\gamma_1^p, \dots, \gamma_r^p) \equiv \sigma_i(\gamma_1, \dots, \gamma_r)^p \equiv \sigma_i(\gamma_1, \dots, \gamma_r) \bmod p, \]
where the second congruence follows from Lemma 9.1.2. Thus the coefficients of $f$
and $f_p$ are congruent modulo $p$. ∎
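Lemma 9.1.8 can be tested on examples without ever computing the roots. The sketch below uses a technique that is not the one in the text: $f$ is the characteristic polynomial of its companion matrix $C$, so $f_p$ is the characteristic polynomial of $C^p$, which we compute with the Faddeev-LeVerrier recursion in integer arithmetic. All function names are ours.

```python
def mat_mul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def companion(f):
    """Companion matrix of a monic f (coefficients, lowest degree first)."""
    r = len(f) - 1
    c = [[0] * r for _ in range(r)]
    for i in range(1, r):
        c[i][i - 1] = 1       # ones on the subdiagonal
    for i in range(r):
        c[i][r - 1] = -f[i]   # last column holds -a_0, ..., -a_{r-1}
    return c

def char_poly(a):
    """det(xI - a) by Faddeev-LeVerrier, coefficients lowest degree first."""
    n = len(a)
    coeffs = [0] * n + [1]
    m = [[0] * n for _ in range(n)]
    for k in range(1, n + 1):
        m = mat_mul(a, m)                 # M_k = a M_{k-1} + c_{n-k+1} I
        for i in range(n):
            m[i][i] += coeffs[n - k + 1]
        trace = sum(mat_mul(a, m)[i][i] for i in range(n))
        coeffs[n - k] = -trace // k       # the division is exact
    return coeffs

def pth_power_polynomial(f, p):
    """The monic polynomial f_p whose roots are the p-th powers of the roots of f."""
    c = companion(f)
    power = c
    for _ in range(p - 1):
        power = mat_mul(power, c)
    return char_poly(power)

f = [-1, -1, 1]                      # x^2 - x - 1
print(pth_power_polynomial(f, 3))    # [-1, -4, 1], i.e. x^2 - 4x - 1
print(pth_power_polynomial(f, 5))    # [-1, -11, 1], i.e. x^2 - 11x - 1
```

In both cases the coefficients agree with those of $f$ modulo $p$, as part (b) predicts.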
We now show that $\Phi_n(x)$ is the minimal polynomial of $\zeta_n$ over $\mathbb{Q}$.


Theorem 9.1.9 The cyclotomic polynomial $\Phi_n(x)$ is irreducible over $\mathbb{Q}$.


Proof: Let $f \in \mathbb{Q}[x]$ be an irreducible factor of $\Phi_n(x)$. Then Gauss's Lemma, in the
form of Corollary 4.2.1, allows us to assume that $f \in \mathbb{Z}[x]$ and that
\[ \Phi_n(x) = f(x)g(x), \tag{9.7} \]
for some $g \in \mathbb{Z}[x]$. We can also assume that $f$ and $g$ are monic, since $\Phi_n(x)$ is.

Let $p$ be a prime not dividing $n$. The first step in the proof is to show that
\[ \text{If } \zeta \text{ is a root of } f, \text{ then so is } \zeta^p. \tag{9.8} \]
We will prove (9.8) by contradiction, so suppose that $f(\zeta) = 0$ and $f(\zeta^p) \ne 0$.

As in Lemma 9.1.8, let $f_p \in \mathbb{Z}[x]$ be the monic polynomial whose roots are the
$p$th powers of the roots of $f$. In Exercise 7 you will show that the roots of $f_p$ are
distinct primitive $n$th roots of unity, which implies that $f_p$ divides $\Phi_n(x)$. If $f$ and $f_p$
had a common root, then $f$ would divide $f_p$, since $f$ is irreducible. This would force
$f = f_p$, since they are monic of the same degree. But $f = f_p$ is impossible, since
$f(\zeta^p) \ne 0$ and $f_p(\zeta^p) = 0$ (the latter follows from $f(\zeta) = 0$ by the definition of $f_p$).
Thus they have no common roots, so that (9.7) can be written
\[ \Phi_n(x) = f(x) f_p(x) h(x). \]
Since $\Phi_n(x)$, $f(x)$, and $f_p(x)$ are monic with integer coefficients, the refined division
algorithm of Exercise 4 implies that the same is true for $h(x)$.

Consider the map sending $q(x) \in \mathbb{Z}[x]$ to the polynomial $\bar{q}(x) \in \mathbb{F}_p[x]$ obtained
by reducing the coefficients of $q(x)$ modulo $p$. Since $\bar{f}_p(x) = \bar{f}(x)$ by Lemma 9.1.8,
the above factorization implies that $\bar{f}^{\,2}(x)$ divides $\bar{\Phi}_n(x)$ in $\mathbb{F}_p[x]$. Thus $\bar{f}^{\,2}(x)$ divides
$x^n - 1$, so that $x^n - 1$ is not separable in $\mathbb{F}_p[x]$. But $x^n - 1$ is separable, since $p \nmid n$.
This contradiction completes the proof of (9.8).

Now let $\zeta$ be a fixed root of $f$ and let $\zeta'$ be any primitive $n$th root of unity. In
Exercise 7 you will show that $\zeta' = \zeta^j$ for some $j$ relatively prime to $n$. Let $j = p_1 \cdots p_r$
be the prime factorization of $j$, and note that each $p_i$ is relatively prime to $n$. Then
successive application of (9.8) shows that
\[ \zeta,\ \zeta^{p_1},\ \zeta^{p_1 p_2},\ \zeta^{p_1 p_2 p_3},\ \dots,\ \zeta^{p_1 \cdots p_r} = \zeta' \]
are roots of $f$. Hence every primitive $n$th root of unity is a root of $f$. Since $f$ divides
$\Phi_n(x)$, we conclude that $f = \Phi_n(x)$. Thus $\Phi_n(x)$ is irreducible, since $f$ is. ∎
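The proof of (9.8) uses the fact that $x^n - 1$ is separable over $\mathbb{F}_p$ when $p \nmid n$. The following Python sketch tests separability through the standard criterion $\gcd(f, f') = 1$, using a small Euclidean algorithm in $\mathbb{F}_p[x]$. The helpers are our own, and `pow(b, -1, p)` requires Python 3.8 or later.

```python
def trim(a, p):
    """Reduce coefficients mod p and drop leading zeros (lowest degree first)."""
    a = [c % p for c in a]
    while a and a[-1] == 0:
        a.pop()
    return a

def poly_rem(a, b, p):
    """Remainder of a on division by a nonzero b in F_p[x]."""
    a, b = trim(a, p), trim(b, p)
    inv = pow(b[-1], -1, p)  # inverse of the leading coefficient of b
    while len(a) >= len(b):
        c = a[-1] * inv % p
        shift = len(a) - len(b)
        for j, d in enumerate(b):
            a[shift + j] = (a[shift + j] - c * d) % p
        a = trim(a, p)
    return a

def poly_gcd(a, b, p):
    """A greatest common divisor in F_p[x] (not normalized to be monic)."""
    a, b = trim(a, p), trim(b, p)
    while b:
        a, b = b, poly_rem(a, b, p)
    return a

def xn_minus_1_is_separable(n, p):
    f = [-1] + [0] * (n - 1) + [1]   # x^n - 1
    df = [0] * (n - 1) + [n]         # its derivative n x^(n-1)
    return len(poly_gcd(f, df, p)) == 1  # gcd is a nonzero constant

print(xn_minus_1_is_separable(12, 5))  # True, since 5 does not divide 12
print(xn_minus_1_is_separable(12, 3))  # False, since 3 divides 12
```

When $p \mid n$ the derivative vanishes identically, so the gcd is $x^n - 1$ itself.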


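Theorem 9.1.9 implies that $\Phi_n(x)$ is the minimal polynomial of $\zeta_n$ over $\mathbb{Q}$. Thus
$[\mathbb{Q}(\zeta_n):\mathbb{Q}] = \deg(\Phi_n(x)) = \phi(n)$, which proves the following corollary.


Corollary 9.1.10 $[\mathbb{Q}(\zeta_n):\mathbb{Q}] = \phi(n)$. ∎

This makes it easy to compute the Galois group of a cyclotomic extension.


Theorem 9.1.11 There is an isomorphism $\mathrm{Gal}(\mathbb{Q}(\zeta_n)/\mathbb{Q}) \simeq (\mathbb{Z}/n\mathbb{Z})^*$ such that $\sigma \in
\mathrm{Gal}(\mathbb{Q}(\zeta_n)/\mathbb{Q})$ maps to $[\ell] \in (\mathbb{Z}/n\mathbb{Z})^*$ if and only if $\sigma(\zeta_n) = \zeta_n^\ell$.


Proof: We know from (8.6) that $\mathbb{Q} \subset \mathbb{Q}(\zeta_n)$ is a Galois extension. Furthermore,
an element $\sigma \in \mathrm{Gal}(\mathbb{Q}(\zeta_n)/\mathbb{Q})$ is uniquely determined by $\sigma(\zeta_n)$, which is a root of
$\Phi_n(x)$ because $\zeta_n$ is. Thus $\sigma(\zeta_n) = \zeta_n^\ell$ for some $\ell$ relatively prime to $n$. By Exercise 4
of Section 6.2, the map $\sigma \mapsto [\ell]$ is a well-defined one-to-one group homomorphism
$\mathrm{Gal}(\mathbb{Q}(\zeta_n)/\mathbb{Q}) \to (\mathbb{Z}/n\mathbb{Z})^*$. Then Corollary 9.1.10 implies that
\[ |\mathrm{Gal}(\mathbb{Q}(\zeta_n)/\mathbb{Q})| = [\mathbb{Q}(\zeta_n):\mathbb{Q}] = \phi(n) = |(\mathbb{Z}/n\mathbb{Z})^*|. \]
It follows that $\mathrm{Gal}(\mathbb{Q}(\zeta_n)/\mathbb{Q}) \to (\mathbb{Z}/n\mathbb{Z})^*$ is an isomorphism. ∎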
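Theorem 9.1.11 lets us study the Galois group with integer arithmetic alone. In the Python sketch below, an automorphism $\sigma_\ell$ is modeled by its action $k \mapsto \ell k \bmod n$ on exponents of $\zeta_n$. This modeling, and the function names, are our own illustration.

```python
from math import gcd

def units(n):
    """Representatives of (Z/nZ)^* for n >= 2."""
    return [l for l in range(1, n) if gcd(l, n) == 1]

def sigma(l, k, n):
    """Exponent of sigma_l(zeta_n^k), where sigma_l(zeta_n) = zeta_n^l."""
    return (l * k) % n

def multiplicative_order(l, n):
    """Order of [l] in (Z/nZ)^*, which is the order of sigma_l."""
    k, power = 1, l % n
    while power != 1:
        power = power * l % n
        k += 1
    return k

n = 8
# Composition of automorphisms corresponds to multiplication of classes.
assert all(sigma(a, sigma(b, 1, n), n) == (a * b) % n
           for a in units(n) for b in units(n))
# Each sigma_l permutes the primitive n-th roots of unity.
assert all(sorted(sigma(l, k, n) for k in units(n)) == units(n) for l in units(n))

print([multiplicative_order(l, 8) for l in units(8)])  # [1, 2, 2, 2]
print([multiplicative_order(l, 7) for l in units(7)])  # [1, 3, 6, 3, 6, 2]
```

So the Galois group for $n = 8$ has order 4 but is not cyclic, while for $n = 7$ it is cyclic of order 6.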


In the next chapter we will use Corollary 9.1.10 to characterize those n for which 
a regular polygon with n sides is constructible by straightedge and compass. 


Historical Notes 


While both Lagrange and Vandermonde made significant use of roots of unity, 
the first systematic study of cyclotomic extensions is due to Gauss. Most of Gauss’s 
results appear in Section VII of Disquisitiones Arithmeticae [5], published in 1801. 
This amazing book covers a wide range of topics in number theory. In particular, 
Gauss introduces the congruence notation $a \equiv b \bmod n$ and proves a version of
Gauss's Lemma (Theorem A.3.2).

In Section VII Gauss studies the extension $\mathbb{Q} \subset \mathbb{Q}(\zeta_p)$, where $p$ is prime. As we
will see in the next section, Gauss constructs primitive elements for all intermediate
fields and essentially describes the Galois correspondence. In Article 365 of [5]
he applies his results to the constructibility of regular polygons by straightedge and
compass. We will discuss this in the next chapter.

To study $\mathbb{Q} \subset \mathbb{Q}(\zeta_p)$, Gauss needed to know that $\Phi_p(x) = x^{p-1} + \cdots + 1$ is
irreducible over $\mathbb{Q}$. Not surprisingly, he proves this using Gauss's Lemma. For general
$n > 1$, the entry dated June 12, 1808 of Gauss's mathematical diary (see [6]) reads as
follows:


The equation ... that contains all primitive roots of the equation $x^n - 1 = 0$ cannot
be decomposed into factors with rational coefficients, proved for composite
values of $n$.


Unfortunately, Gauss's proof has been lost. The first published proof that $\Phi_n(x)$ is
irreducible (Theorem 9.1.9) appeared in 1854 and is due to Kronecker. Our proof is
based on arguments of Dedekind, as presented by Jordan in 1870. The key step is
(9.8), which we proved using Lemma 9.1.8. Schönemann's proof of this lemma dates
from 1846, though Gauss proved it much earlier in an unpublished continuation of
[5]. A modern proof of (9.8) is sketched in Exercise 8.


Exercises for Section 9.1 


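Exercise 1. Prove that a congruence class $[i] \in \mathbb{Z}/n\mathbb{Z}$ has a multiplicative inverse if and only
if $\gcd(i,n) = 1$. Conclude that $(\mathbb{Z}/n\mathbb{Z})^*$ has order $\phi(n)$. Be sure that you understand what
happens when $n = 1$.


Exercise 2. Assume that $\gcd(n,m) = 1$. By Lemma A.5.2, we have a ring isomorphism
$\alpha : \mathbb{Z}/nm\mathbb{Z} \simeq \mathbb{Z}/n\mathbb{Z} \times \mathbb{Z}/m\mathbb{Z}$ that sends $[a]_{nm}$ to $([a]_n, [a]_m)$. Prove that $\alpha$ induces a group
isomorphism $(\mathbb{Z}/nm\mathbb{Z})^* \simeq (\mathbb{Z}/n\mathbb{Z})^* \times (\mathbb{Z}/m\mathbb{Z})^*$.


Exercise 3. Let $\zeta_n = e^{2\pi i/n} \in \mathbb{C}$. Prove that $\zeta_n^i$ for $0 \le i < n$ and $\gcd(i,n) = 1$ are the primitive
$n$th roots of unity in $\mathbb{C}$.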


Exercise 4. Let $R$ be an integral domain, and let $f, g \in R[x]$, where $f \ne 0$. If $K$ is the
field of fractions of $R$, then we can divide $g$ by $f$ in $K[x]$ using the division algorithm of
Theorem A.1.14. This gives $g = qf + r$, though $q, r \in K[x]$ need not lie in $R[x]$.
(a) Show that dividing $x^2$ by $2x + 1$ in $\mathbb{Q}[x]$ gives $x^2 = q \cdot (2x + 1) + r$, where $q, r \in \mathbb{Q}[x]$ are
not in $\mathbb{Z}[x]$, even though $x^2$ and $2x + 1$ lie in $\mathbb{Z}[x]$.
(b) Show that if $f$ is monic, then the division algorithm gives $g = qf + r$, where $q, r \in R[x]$.
Hence the division algorithm works over $R$ provided we divide by monic polynomials.


Exercise 5. Verify the formula for ®105(x) given in Example 9.1.7. 


Exercise 6. This exercise is concerned with the proof of Lemma 9.1.8.
(a) Let $f \in \mathbb{Z}[x_1, \dots, x_n]$ be symmetric. Prove that $f$ is a polynomial in $\sigma_1, \dots, \sigma_n$ with integer
coefficients.
(b) Let $p$ be prime and let $h \in \mathbb{F}_p[x_1, \dots, x_n]$. Prove that $h(x_1, \dots, x_n)^p = h(x_1^p, \dots, x_n^p)$.


Exercise 7. This exercise is concerned with the proof of Theorem 9.1.9.
(a) Let $\zeta$ be a primitive $n$th root of unity, and let $i$ be relatively prime to $n$. Prove that $\zeta^i$ is a
primitive $n$th root of unity and that every primitive $n$th root of unity is of this form.
(b) Let $\gamma_1, \dots, \gamma_r$ be distinct primitive $n$th roots of unity and let $i$ be relatively prime to $n$.
Prove that $\gamma_1^i, \dots, \gamma_r^i$ are distinct.


Exercise 8. This exercise will present an alternate proof of (9.8) that doesn't use symmetric
polynomials. Assume that $\zeta$ is a root of $f$ such that $f(\zeta^p) \ne 0$. As in the text, $q(x) \in \mathbb{Z}[x]$
maps to the polynomial $\bar{q}(x) \in \mathbb{F}_p[x]$. Let $g(x)$ be as in (9.7).
(a) Prove that $\zeta$ is a root of $g(x^p)$, and conclude that $f(x) \mid g(x^p)$.
(b) Use Gauss's Lemma to explain why $f(x)$ divides $g(x^p)$ in $\mathbb{Z}[x]$, and conclude that $\bar{f}(x)$
divides $\bar{g}(x^p)$ in $\mathbb{F}_p[x]$.
(c) Use Exercise 7 to prove that $\bar{g}(x)^p = \bar{g}(x^p)$, and conclude that $\bar{f}(x)$ divides $\bar{g}(x)^p$.
(d) Now let $h(x) \in \mathbb{F}_p[x]$ be an irreducible factor of $\bar{f}(x)$. Show that $h(x)$ divides $\bar{g}(x)$, so that
$h(x)^2$ divides $\bar{f}(x)\bar{g}(x)$.
(e) Conclude that $h(x)^2$ divides $x^n - 1 \in \mathbb{F}_p[x]$.
(f) Use separability to obtain a contradiction.


Exercise 9. In proving Fermat's Little Theorem $a^p \equiv a \bmod p$, recall from the proof of
Lemma 9.1.2 that we first proved $a^{p-1} \equiv 1 \bmod p$ when $a$ is relatively prime to $p$. For general
$n > 1$, Euler showed that $a^{\phi(n)} \equiv 1 \bmod n$ when $a$ is relatively prime to $n$. Prove this. What
basic fact from group theory do you use?


Exercise 10. Prove that a cyclic group of order $n$ has $\phi(n)$ generators.

Exercise 11. Prove that $n = \sum_{d \mid n} \phi(d)$.


Exercise 12. Here are some further properties of cyclotomic polynomials.
(a) Given $n$, let $m = \prod_{p \mid n} p$. Prove that $\Phi_n(x) = \Phi_m(x^{n/m})$. This shows that we can reduce
computing $\Phi_n(x)$ to the case when $n$ is squarefree.
(b) Let $n > 1$ be an odd integer. Prove that $\Phi_{2n}(x) = \Phi_n(-x)$.
(c) Let $p$ be a prime not dividing an integer $n > 1$. Prove that $\Phi_{pn}(x) = \Phi_n(x^p)/\Phi_n(x)$.


Exercise 13. We know $\Phi_p(x)$ when $p$ is prime. Use this and Exercise 12 to compute $\Phi_{15}(x)$
and $\Phi_{105}(x)$.


Exercise 14. The Möbius function is defined for integers $n \ge 1$ by
\[ \mu(n) = \begin{cases} 1, & \text{if } n = 1, \\ (-1)^s, & \text{if } n = p_1 \cdots p_s \text{ for distinct primes } p_1, \dots, p_s, \\ 0, & \text{otherwise.} \end{cases} \]
Prove that $\sum_{d \mid n} \mu\bigl(\frac{n}{d}\bigr) = 0$ when $n > 1$.


Exercise 15. Let $\mu$ be the Möbius function defined in Exercise 14. Prove that
\[ \Phi_n(x) = \prod_{d \mid n} (x^d - 1)^{\mu(n/d)}. \]
This representation of $\Phi_n(x)$ is useful when studying the size of its coefficients.


Exercise 16. Let $n$ and $m$ be relatively prime positive integers.
(a) Prove that $\mathbb{Q}(\zeta_n, \zeta_m) = \mathbb{Q}(\zeta_{nm})$.
(b) Prove that $\Phi_n(x)$ is irreducible over $\mathbb{Q}(\zeta_m)$.

9.2 GAUSS AND ROOTS OF UNITY (OPTIONAL)


In this section we will explore how Gauss studied $\mathbb{Q} \subset \mathbb{Q}(\zeta_p)$, where $p$ is an odd
prime. Working 30 years before Galois, Gauss described the intermediate fields of
this extension and used his results to show that $x^p - 1 = 0$ is solvable by radicals.


A. The Galois Correspondence. If $p$ is an odd prime, then Theorem 9.1.11
implies that
\[ \mathrm{Gal}(\mathbb{Q}(\zeta_p)/\mathbb{Q}) \simeq (\mathbb{Z}/p\mathbb{Z})^*. \]
Let us recall what we know about this group:
• $(\mathbb{Z}/p\mathbb{Z})^*$ is cyclic of order $p - 1$ by Proposition A.5.3.
• For every positive divisor $f$ of $p - 1$, $(\mathbb{Z}/p\mathbb{Z})^*$ has a unique subgroup $H_f$ of order
$f$ by Theorem A.1.4.
Following Gauss, we let $e = \frac{p-1}{f}$. Thus
\[ ef = p - 1, \]
and $H_f$ has index $e$ in $(\mathbb{Z}/p\mathbb{Z})^*$. We will use this notation throughout the section.
One further fact not mentioned earlier is the following:
• If $f$ and $f'$ are positive divisors of $p - 1$, then $H_f \subset H_{f'}$ if and only if $f \mid f'$.
You will prove this in Exercise 1. Hence we can easily check when one subgroup is
contained in another.
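For a specific prime the subgroups $H_f$ can simply be listed. The Python sketch below takes $p = 13$ (our choice) and uses the fact that in a cyclic group of order $p - 1$, the elements satisfying $a^f = 1$ form the unique subgroup of order $f$. It checks the containment criterion on this one example and is not a proof of Exercise 1.

```python
def subgroup_of_order(f, p):
    """The unique subgroup H_f of order f in (Z/pZ)^*, for f dividing p - 1."""
    return frozenset(a for a in range(1, p) if pow(a, f, p) == 1)

p = 13
divisors = [f for f in range(1, p) if (p - 1) % f == 0]
H = {f: subgroup_of_order(f, p) for f in divisors}

# H_f has order f, and H_f is contained in H_g exactly when f divides g.
assert all(len(H[f]) == f for f in divisors)
lattice_ok = all((H[f] <= H[g]) == (g % f == 0) for f in divisors for g in divisors)

print(divisors)      # [1, 2, 3, 4, 6, 12]
print(sorted(H[3]))  # [1, 3, 9]
print(sorted(H[4]))  # [1, 5, 8, 12]
print(lattice_ok)    # True
```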


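By the isomorphism $\mathrm{Gal}(\mathbb{Q}(\zeta_p)/\mathbb{Q}) \simeq (\mathbb{Z}/p\mathbb{Z})^*$ and the Galois correspondence,
the intermediate fields of $\mathbb{Q} \subset \mathbb{Q}(\zeta_p)$ are the fixed fields
\[ L_f = \{\alpha \in \mathbb{Q}(\zeta_p) \mid \sigma(\alpha) = \alpha \text{ for all } \sigma \text{ with } \sigma(\zeta_p) = \zeta_p^i,\ [i] \in H_f\} \]
as $f$ ranges over all positive divisors of $p - 1$. These fixed fields have the following
nice properties.


Proposition 9.2.1 The intermediate fields $\mathbb{Q} \subset L_f \subset \mathbb{Q}(\zeta_p)$ satisfy:

(a) $L_f$ is a Galois extension of $\mathbb{Q}$ of degree $e$.

(b) If $f$ and $f'$ are positive divisors of $p - 1$, then $L_f \supset L_{f'}$ if and only if $f \mid f'$.

(c) If $f$ and $f'$ are positive divisors of $p - 1$ such that $f \mid f'$, then $\mathrm{Gal}(L_f/L_{f'})$ is
cyclic of order $f'/f$.


Proof: You will supply the straightforward proof in Exercise 2. ∎


In particular, if $p - 1 = q_1 q_2 \cdots q_r$ is the prime factorization of $p - 1$, then we get
subfields
\[ \mathbb{Q} = L_{q_1 q_2 \cdots q_r} \subset L_{q_2 \cdots q_r} \subset \cdots \subset L_{q_{r-1} q_r} \subset L_{q_r} \subset L_1 = \mathbb{Q}(\zeta_p), \tag{9.9} \]
where $[L_{q_{i+1} \cdots q_r} : L_{q_i q_{i+1} \cdots q_r}] = q_i$. Thus every element of $L_{q_{i+1} \cdots q_r}$ is the root of a
polynomial of degree $q_i$ over $L_{q_i q_{i+1} \cdots q_r}$.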

All of this is a simple application of Galois theory. The surprise is that Gauss
understood most of this, including (9.9). Before discussing Gauss's results, let us do
an example.

Example 9.2.2 Let $p = 7$. Then (9.9) with $p - 1 = 6 = 3 \cdot 2$ becomes
\[ \mathbb{Q} = L_6 \subset L_2 \subset L_1 = \mathbb{Q}(\zeta_7), \]
where $L_2$ is the fixed field of the unique subgroup of order 2 of $\mathrm{Gal}(\mathbb{Q}(\zeta_7)/\mathbb{Q})$.

To make this more explicit, consider $\eta_1 = \zeta_7 + \zeta_7^{-1} = \zeta_7 + \overline{\zeta_7} = 2\cos(2\pi/7)$.
In Exercise 3 you will show that $\mathbb{Q}(\eta_1)$ corresponds to the subgroup $\{e, \tau\}$ of
$\mathrm{Gal}(\mathbb{Q}(\zeta_7)/\mathbb{Q})$, where $\tau$ is complex conjugation. This subgroup has order 2, which
implies that
\[ L_2 = \mathbb{Q}(\eta_1). \]
In Exercise 3 you will also show that the conjugates of $\eta_1$ over $\mathbb{Q}$ are
\[ \eta_2 = \zeta_7^2 + \zeta_7^{-2} = 2\cos(4\pi/7) \quad \text{and} \quad \eta_3 = \zeta_7^3 + \zeta_7^{-3} = 2\cos(6\pi/7), \]
and that $\eta_1, \eta_2, \eta_3$ are roots of the cubic equation
\[ y^3 + y^2 - 2y - 1 = 0. \]
It is easy to check that $\zeta_7$ is a root of $x^2 - \eta_1 x + 1 \in L_2[x]$. From here we can
express $\zeta_7$ in terms of radicals as follows. Applying Cardan's formulas to the above
cubic, one sees that
\[ \eta_1 = -\frac{1}{3} + \frac{1}{3}\sqrt[3]{\frac{7}{2}\bigl(1 + 3i\sqrt{3}\bigr)} + \frac{1}{3}\sqrt[3]{\frac{7}{2}\bigl(1 - 3i\sqrt{3}\bigr)}, \tag{9.10} \]
provided that the cube roots are chosen correctly (see Exercise 3). Then applying the
quadratic formula to $x^2 - \eta_1 x + 1 = 0$ gives
\[ \zeta_7 = -\frac{1}{6} + \frac{1}{6}\sqrt[3]{\frac{7}{2}\bigl(1 + 3i\sqrt{3}\bigr)} + \frac{1}{6}\sqrt[3]{\frac{7}{2}\bigl(1 - 3i\sqrt{3}\bigr)} + \frac{i}{2}\sqrt{4 - \eta_1^2}, \tag{9.11} \]
where $\eta_1$ is given by (9.10) and we use the same cube roots as in (9.10). ◁
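The radical expressions in Example 9.2.2 can be sanity-checked in floating point. In the Python sketch below we assume that the correct choice of cube roots is the principal cube root of $\frac{7}{2}(1 + 3i\sqrt{3})$ together with its complex conjugate; this choice is ours, and the comparison with $2\cos(2\pi/7)$ is what justifies it numerically.

```python
import cmath
import math

zeta7 = cmath.exp(2j * math.pi / 7)

# The three conjugates 2cos(2*pi*k/7) should be roots of y^3 + y^2 - 2y - 1.
eta = [2 * math.cos(2 * math.pi * k / 7) for k in (1, 2, 3)]
cubic_values = [y**3 + y**2 - 2 * y - 1 for y in eta]

# Cardan-style formula: principal cube root of (7/2)(1 + 3i*sqrt(3)),
# paired with its complex conjugate so that the sum is real.
u = 3.5 * complex(1, 3 * math.sqrt(3))
cube_root = u ** (1 / 3)
eta1_radical = (-1 + 2 * cube_root.real) / 3

# Quadratic formula for x^2 - eta_1 x + 1 = 0, root with positive imaginary part.
zeta7_radical = complex(eta1_radical / 2, math.sqrt(4 - eta1_radical**2) / 2)

print(max(abs(v) for v in cubic_values) < 1e-12)  # True
print(abs(eta1_radical - eta[0]) < 1e-12)         # True
print(abs(zeta7_radical - zeta7) < 1e-12)         # True
```

A floating-point agreement of this kind is evidence, not a proof; the exact statement is the content of Exercise 3.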


Notice how (9.11) is similar to the formula
\[ \zeta_5 = \frac{-1 + \sqrt{5}}{4} + \frac{i}{2}\sqrt{\frac{5 + \sqrt{5}}{2}} \]
from Exercise 8 of Section A.2. These formulas were known to Lagrange and
Vandermonde in the 1770s. Vandermonde also worked out a similar formula for $\zeta_{11}$,
which is more surprising in that it required solving an equation of degree 5 by radicals
(see [Tignol, Ch. 11]).


B. Periods. In Section VII of Disquisitiones, Gauss proves the existence of radical
formulas for $\zeta_p$ for any odd prime $p$. His proof uses periods, which for positive
divisors $f$ of $p - 1$ are carefully chosen primitive elements of $L_f$ over $\mathbb{Q}$.

Let $ef = p - 1$, and let $H_f \subset (\mathbb{Z}/p\mathbb{Z})^*$ be the unique subgroup of order $f$. Given
an element $a = [i] \in (\mathbb{Z}/p\mathbb{Z})^*$, set $\zeta_p^a = \zeta_p^i$. This is well defined, since $\zeta_p^p = 1$. Hence
we can make the following definition.


Definition 9.2.3 Let $\lambda \in \mathbb{Z}$ be relatively prime to $p$. This gives $[\lambda] \in (\mathbb{Z}/p\mathbb{Z})^*$ and
the coset $[\lambda]H_f$ of $H_f$ in $(\mathbb{Z}/p\mathbb{Z})^*$. Then we define an $f$-period to be the sum
\[ (f, \lambda) = \sum_{a \in [\lambda]H_f} \zeta_p^a. \]
Here are some simple properties of $f$-periods.


Lemma 9.2.4 Let $ef = p - 1$, and let $(f, \lambda)$ be defined as above. Then:

(a) Two $f$-periods either are identical or have no terms in common.

(b) There are $e$ distinct $f$-periods.

(c) The $f$-periods are linearly independent over $\mathbb{Q}$.

(d) Let $\sigma \in \mathrm{Gal}(\mathbb{Q}(\zeta_p)/\mathbb{Q})$ satisfy $\sigma(\zeta_p) = \zeta_p^i$. Then, for any $f$-period $(f, \lambda)$,
\[ \sigma((f, \lambda)) = (f, i\lambda). \]


Proof: Recall that $1, \zeta_p, \dots, \zeta_p^{p-2} \in \mathbb{Q}(\zeta_p)$ are linearly independent over $\mathbb{Q}$, since
$[\mathbb{Q}(\zeta_p):\mathbb{Q}] = p - 1$. Multiplying by $\zeta_p$ shows that the same is true for $\zeta_p, \dots, \zeta_p^{p-1}$.
This implies that two $f$-periods coincide if and only if the corresponding cosets of
$H_f$ are equal. Then part (a) follows because cosets are either identical or disjoint, and
part (b) because the number of cosets is the index of $H_f$ in $(\mathbb{Z}/p\mathbb{Z})^*$, which is $e = \frac{p-1}{f}$.
Then part (c) is a consequence of part (a) together with the linear independence of
$\zeta_p, \dots, \zeta_p^{p-1}$ over $\mathbb{Q}$.

For part (d), observe that $\zeta_p^{i} = \zeta_p^{[i]}$. Thus $\sigma(\zeta_p) = \zeta_p^{i}$ implies that
\[ \sigma((f, \lambda)) = \sum_{a \in [\lambda]H_f} \bigl(\zeta_p^{i}\bigr)^a = \sum_{a \in [\lambda]H_f} \zeta_p^{[i]a} = \sum_{b \in [i\lambda]H_f} \zeta_p^{b} = (f, i\lambda), \]
where the third equality follows via the substitution $b = [i]a$. ∎


Here are some particularly simple periods. 


Example 9.2.5 Since $p$ is odd, the unique subgroup of $(\mathbb{Z}/p\mathbb{Z})^*$ of order 2 is $H_2 =
\{[1], [-1]\}$. The cosets of this subgroup are $[\lambda]H_2 = \{[\lambda], [-\lambda]\}$, so that the 2-periods
are
\[ (2, \lambda) = \zeta_p^{\lambda} + \zeta_p^{-\lambda} = 2\cos(2\pi\lambda/p). \]
The number of 2-periods is $e = \frac{p-1}{2}$.
In particular, when $p = 7$, the distinct 2-periods are $(2,1)$, $(2,2)$, and $(2,3)$. These
were denoted $\eta_1$, $\eta_2$, and $\eta_3$ in Example 9.2.2. ◁


We now prove that f-periods give the desired primitive elements. 
Proposition 9.2.6 Let $L_f$ be the fixed field of $H_f$. Then:
(a) Let $(f,\lambda_1), \ldots, (f,\lambda_e)$ be the distinct $f$-periods. Then

\[ g(x) = (x - (f,\lambda_1)) \cdots (x - (f,\lambda_e)) \]

is in $\mathbb{Q}[x]$ and is the minimal polynomial of any $f$-period over $\mathbb{Q}$.

(b) Any $f$-period is a primitive element of $L_f$ over $\mathbb{Q}$.


Proof: An $f$-period $\eta = (f,\lambda)$ corresponds to a coset $[\lambda]H_f$. If $[i] \in (\mathbb{Z}/p\mathbb{Z})^*$,
then the $f$-period $(f,i\lambda)$ corresponding to $[i\lambda]H_f$ is a conjugate of $\eta$ over $\mathbb{Q}$, by
Lemma 9.2.4. Since $[i\lambda]H_f$ gives all cosets of $H_f$ as we vary $[i]$, the conjugates of
$\eta$ over $\mathbb{Q}$ are the $e$ distinct $f$-periods $(f,\lambda_1), \ldots, (f,\lambda_e)$. Then part (a) follows from
the formula for the minimal polynomial given in equation (7.1) of Chapter 7.

Hence $\mathbb{Q} \subset \mathbb{Q}(\eta)$ and $\mathbb{Q} \subset L_f$ are extensions of degree $e$. Since $\mathrm{Gal}(\mathbb{Q}(\zeta_p)/\mathbb{Q}) \simeq (\mathbb{Z}/p\mathbb{Z})^*$
has a unique subgroup of index $e$, the Galois correspondence implies that
$\mathbb{Q}(\eta) = L_f$. This proves part (b). ∎


As a corollary, we get the following interesting basis of $L_f$ over $\mathbb{Q}$.


Corollary 9.2.7 The $f$-periods form a basis of $L_f$ over $\mathbb{Q}$.


Proof: The $f$-periods lie in $L_f$ by Proposition 9.2.6. Furthermore, Lemma 9.2.4
tells us that the $e$ such periods are linearly independent over $\mathbb{Q}$. The corollary follows,
since $[L_f : \mathbb{Q}] = e$ by Proposition 9.2.1. ∎


Our next task is to describe the extension $L_{f'} \subset L_f$ in terms of periods, where $f$ and
$f'$ are positive divisors of $p-1$ satisfying $f \mid f'$. Set $d = f'/f$, so that $[L_f : L_{f'}] = d$.
Any $f$-period $(f,\lambda)$ is a primitive element of $L_f$ over $L_{f'}$. We need to describe its
minimal polynomial over $L_{f'}$.

This is done as follows. Observe that $H_f$ is a subgroup of index $d = f'/f$ in $H_{f'}$.
Hence every coset of $H_{f'}$ in $(\mathbb{Z}/p\mathbb{Z})^*$ is a disjoint union of $d$ cosets of $H_f$. (Do
Exercise 4 if you are unsure of this.) In particular, $[\lambda]H_{f'}$ is a disjoint union

\[ [\lambda]H_{f'} = [\lambda_1]H_f \cup \cdots \cup [\lambda_d]H_f, \tag{9.12} \]

where we may assume $\lambda_1 = \lambda$, since $[\lambda]H_f \subset [\lambda]H_{f'}$. This leads to the following
description of the desired minimal polynomial.


Proposition 9.2.8 Let $f$ and $f'$ be positive divisors of $p-1$ such that $f \mid f'$, and set
$d = f'/f$. Given an $f$-period $(f,\lambda)$, let $\lambda = \lambda_1, \lambda_2, \ldots, \lambda_d$ be as in (9.12). Then

\[ h(x) = (x - (f,\lambda_1)) \cdots (x - (f,\lambda_d)) \]

is in $L_{f'}[x]$ and is the minimal polynomial of $(f,\lambda)$ over $L_{f'}$.
Proof: The proof is similar to what we did in part (b) of Proposition 9.2.6. Setting
$\eta = (f,\lambda)$, we need to show that as $\sigma$ varies over $\mathrm{Gal}(\mathbb{Q}(\zeta_p)/L_{f'})$, the elements $\sigma(\eta)$
give the $f$-periods $(f,\lambda_1), \ldots, (f,\lambda_d)$.

To prove this, let $\sigma \in \mathrm{Gal}(\mathbb{Q}(\zeta_p)/L_{f'})$, so that $\sigma(\zeta_p) = \zeta_p^i$ for $[i] \in H_{f'}$. Then

\[ \sigma(\eta) = \sigma((f,\lambda)) = (f,i\lambda). \]

This $f$-period corresponds to the coset $[i\lambda]H_f$. However,

\[ [i\lambda]H_f \subset [i\lambda]H_{f'} = [\lambda][i]H_{f'} = [\lambda]H_{f'}, \]

where the final equality uses $[i] \in H_{f'}$. By (9.12), it follows that $[i\lambda]H_f = [\lambda_j]H_f$ for
some $j$, so that $\sigma(\eta) = (f,i\lambda) = (f,\lambda_j)$. Since every $(f,\lambda_j)$ arises in this way (see
Exercise 5), the proposition is proved. ∎


We will give an example of Proposition 9.2.8 below. 


C. Explicit Calculations. The above results are pretty but somewhat abstract. To
compute specific examples, we need a concrete way to work with periods. The key
idea, due to Gauss, is to pick a generator $[g]$ of the cyclic group $(\mathbb{Z}/p\mathbb{Z})^*$. Since this
group has order $p-1$, it follows that

\[ (\mathbb{Z}/p\mathbb{Z})^* = \{[1], [g], [g^2], \ldots, [g^{p-2}]\}. \]

In other words, the $p-1$ numbers $1, g, g^2, \ldots, g^{p-2}$ represent the nonzero congruence
classes modulo $p$. We call $g$ a primitive root modulo $p$.

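Given a primitive root $g$ and $ef = p-1$ as usual, Exercise 1 implies that $H_f$ is
generated by $[g^e]$, i.e.,

\[ H_f = \{[1], [g^e], [g^{2e}], \ldots, [g^{(f-1)e}]\}. \]

It follows that the coset $[\lambda]H_f$ gives the $f$-period

\[ (f,\lambda) = \zeta_p^{\lambda} + \zeta_p^{\lambda g^e} + \zeta_p^{\lambda g^{2e}} + \cdots + \zeta_p^{\lambda g^{(f-1)e}} = \sum_{j=0}^{f-1} \zeta_p^{\lambda g^{ej}}. \tag{9.13} \]

So far, we have assumed that $[\lambda] \in (\mathbb{Z}/p\mathbb{Z})^*$, i.e., $p \nmid \lambda$. However, (9.13) makes sense
for any integer $\lambda$. Since $\zeta_p^p = 1$, one easily sees that

\[ (f,\lambda) = f \quad \text{when } p \mid \lambda. \]

For an arbitrary $\lambda \in \mathbb{Z}$, we call $(f,\lambda)$ a generalized period. Thus a generalized period
is an ordinary period if $p \nmid \lambda$ and is equal to $f$ if $p \mid \lambda$.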
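Formula (9.13) is easy to evaluate numerically, which is a handy sanity check when working examples by hand. The following is a minimal sketch in Python, not part of the original text; the function name `period` and the use of floating-point complex arithmetic are our own choices, and the caller is assumed to supply a genuine primitive root g and a divisor f of p - 1.

```python
import cmath

def period(p, g, f, lam):
    """Numerical value of the generalized period (f, lam) for the prime p.

    Follows (9.13): sum of zeta_p^(lam * g^(e*j)) for j = 0, ..., f-1,
    where e = (p-1)/f. Assumes g is a primitive root mod p and f | p-1.
    """
    e = (p - 1) // f
    zeta = cmath.exp(2j * cmath.pi / p)
    return sum(zeta ** (lam * pow(g, e * j, p) % p) for j in range(f))

# p = 7 with primitive root 3: the 2-period (2, 1) should be 2cos(2*pi/7).
print(period(7, 3, 2, 1))
# p = 19 with primitive root 2: a generalized period with p | lam equals f.
print(period(19, 2, 6, 19))
```

Because the exponents are reduced modulo p before exponentiating, the same function handles ordinary and generalized periods alike.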

In order to compute the minimal polynomials appearing in Propositions 9.2.6 
and 9.2.8, we need to know how to multiply f-periods. Gauss expressed the product 
of two f-periods in terms of generalized periods as follows. 
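Proposition 9.2.9 If $(f,\lambda)$ and $(f,\mu)$ are $f$-periods with $p \nmid \lambda$ and $p \nmid \mu$, then

\[ (f,\lambda)(f,\mu) = \sum_{[\lambda'] \in [\lambda]H_f} (f,\lambda' + \mu) = \sum_{j=0}^{f-1} (f,\lambda g^{ej} + \mu). \]


Proof: Following [5, Art. 345], we set $h = g^e$, so that

\[ (f,\mu) = \sum_{\ell=0}^{f-1} \zeta_p^{\mu h^{\ell}}, \]

since $[h]$ generates $H_f$. We also have $[\lambda]H_f = [\lambda h^{\ell}]H_f$ for any $\ell$, which implies that
$(f,\lambda) = (f,\lambda h^{\ell})$. Thus

\begin{align*}
(f,\lambda)(f,\mu) &= \sum_{\ell=0}^{f-1} (f,\lambda)\,\zeta_p^{\mu h^{\ell}} = \sum_{\ell=0}^{f-1} (f,\lambda h^{\ell})\,\zeta_p^{\mu h^{\ell}} \\
&= \sum_{\ell=0}^{f-1} \Bigl( \sum_{j=0}^{f-1} \zeta_p^{\lambda h^{\ell} h^{j}} \Bigr) \zeta_p^{\mu h^{\ell}} = \sum_{j=0}^{f-1} \Bigl( \sum_{\ell=0}^{f-1} \zeta_p^{(\lambda h^{j} + \mu) h^{\ell}} \Bigr) \\
&= \sum_{j=0}^{f-1} (f,\lambda h^{j} + \mu).
\end{align*}

This gives the desired formula, since $h = g^e$. ∎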


Here is an example from [5, Art. 346]. 


Example 9.2.10 In this example and the three that follow, we will consider the 6-periods
for $p = 19$. In Exercise 7 you will show that $g = 2$ is a primitive root modulo 19.
Since $f = 6$ implies $e = 3$, the unique subgroup of order 6 in $(\mathbb{Z}/19\mathbb{Z})^*$ is generated
by $[2]^3 = [8]$. Thus

\[ H_6 = \{[1],[8],[8]^2,[8]^3,[8]^4,[8]^5\} = \{[1],[8],[7],[18],[11],[12]\} \subset (\mathbb{Z}/19\mathbb{Z})^*. \]

For simplicity, we will write $[n]$ as $n$, so that

\[ H_6 = \{1,7,8,11,12,18\}. \]

The $e = 3$ cosets of $H_6$ in $(\mathbb{Z}/19\mathbb{Z})^*$ are $H_6$ together with

\begin{align*}
2H_6 &= \{2,14,16,22,24,36\} = \{2,3,5,14,16,17\}, \\
4H_6 &= \{4,28,32,44,48,72\} = \{4,6,9,10,13,15\}.
\end{align*}

(Remember that we are working modulo 19.)
According to Proposition 9.2.9,

\begin{align*}
(6,1)^2 &= (6,1+1) + (6,7+1) + (6,8+1) + (6,11+1) + (6,12+1) + (6,18+1) \\
&= (6,2) + (6,8) + (6,9) + (6,12) + (6,13) + 6,
\end{align*}

where the second equality uses $(6,19) = 6$. This shows that generalized periods can
arise when we multiply ordinary periods. However,

\[ (6,8) = (6,1) \]

since 8 and 1 lie in the same coset of $H_6$. Using similar simplifications, we get

\[ (6,1)^2 = 2(6,1) + (6,2) + 2(6,4) + 6. \]

By Exercise 6 we also have

\[ (6,1) + (6,2) + (6,4) = -1. \]

Then the formula for $(6,1)^2$ simplifies to

\[ (6,1)^2 = 4 - (6,2). \]

You will work out similar formulas in Exercise 7. ◁▷
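The bookkeeping in Example 9.2.10 is mechanical: apply Proposition 9.2.9, then replace each $\lambda' + \mu$ by a fixed representative of its coset. The sketch below does this with exact integer arithmetic. It is only an illustration; the helper names (`subgroup`, `coset_rep`, `period_product`) and the convention of using the smallest element of a coset as its representative, with key 0 standing for a generalized period equal to f, are our own assumptions.

```python
def subgroup(p, g, f):
    """Elements of H_f in (Z/pZ)^*, assuming g is a primitive root mod p and f | p-1."""
    e = (p - 1) // f
    return sorted({pow(g, e * j, p) for j in range(f)})

def coset_rep(p, H, lam):
    """Smallest element of the coset [lam]H; returns 0 when p divides lam."""
    return min(lam * h % p for h in H)

def period_product(p, g, f, lam, mu):
    """Expand (f,lam)(f,mu) via Proposition 9.2.9.

    Returns a dict {representative: coefficient}. The key 0 counts
    generalized periods (f, multiple of p), each of which equals f.
    """
    H = subgroup(p, g, f)
    result = {}
    for h in H:
        rep = coset_rep(p, H, lam * h + mu)
        result[rep] = result.get(rep, 0) + 1
    return result

# (6,1)^2 for p = 19: expect 2(6,1) + (6,2) + 2(6,4) + one generalized period.
print(period_product(19, 2, 6, 1, 1))
```

Substituting $(6,1) + (6,2) + (6,4) = -1$ into the printed coefficients recovers $(6,1)^2 = 4 - (6,2)$.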


Example 9.2.11 Still assuming $p = 19$, our next task is to compute the minimal
polynomial of the 6-periods over $\mathbb{Q}$. We will use the notation of the previous
example. By Proposition 9.2.6, the minimal polynomial is

\[ (x - (6,1))(x - (6,2))(x - (6,4)). \tag{9.14} \]

In Exercise 7 you will use the methods of Example 9.2.10 to show that

\begin{align*}
(6,1)(6,2) &= (6,1) + 2(6,2) + 3(6,4), \\
(6,1)(6,4) &= 2(6,1) + 3(6,2) + (6,4), \tag{9.15} \\
(6,2)(6,4) &= 3(6,1) + (6,2) + 2(6,4).
\end{align*}

Note that these sum to $6(6,1) + 6(6,2) + 6(6,4) = -6$, since (as noted above)
$(6,1) + (6,2) + (6,4) = -1$.
Using (9.15) and $(6,1)^2 = 4 - (6,2)$ (from Example 9.2.10), we have

\begin{align*}
(6,1)(6,2)(6,4) &= (6,1)\bigl(3(6,1) + (6,2) + 2(6,4)\bigr) \\
&= 3(6,1)^2 + (6,1)(6,2) + 2(6,1)(6,4) \\
&= 12 + 5(6,1) + 5(6,2) + 5(6,4) = 7
\end{align*}

(see Exercise 7 for the details). It follows that multiplying out (9.14) gives

\[ x^3 + x^2 - 6x - 7. \tag{9.16} \]

This is the minimal polynomial of the 6-periods over $\mathbb{Q}$. Its splitting field is $\mathbb{Q} \subset L_6$,
the extension generated by the 6-periods. ◁▷
Example 9.2.12 Now consider the 3-periods for $p = 19$. Since $6/3 = 2$, we see
that $L_6 \subset L_3$ is an extension of degree 2. Hence 3-periods have quadratic minimal
polynomials over $L_6$.

Since 2 is a primitive root modulo 19, the subgroup $H_3 \subset (\mathbb{Z}/19\mathbb{Z})^*$ is generated
by $[2]^6 = [8]^2 = [7]$. Using the notation of Example 9.2.10, we have

\[ H_6 = \{1,7,8,11,12,18\} = \{1,7,11\} \cup \{8,12,18\} = H_3 \cup 8H_3. \]

This shows that

\[ (6,1) = (3,1) + (3,8), \]

and in a similar way, one obtains

\begin{align*}
(6,2) &= (3,2) + (3,16), \\
(6,4) &= (3,4) + (3,13).
\end{align*}

However, Proposition 9.2.9 implies that

\[ (3,1)(3,8) = (3,1+8) + (3,7+8) + (3,11+8) = (3,9) + (3,15) + 3, \]

and since $(3,9) = (3,4)$ and $(3,15) = (3,13)$ (do you see why?), we get

\[ (3,1)(3,8) = (3,4) + (3,13) + 3 = (6,4) + 3. \]

By Proposition 9.2.8, the minimal polynomial of $(3,1)$ and $(3,8)$ over $L_6$ is

\[ (x - (3,1))(x - (3,8)) = x^2 - (6,1)x + (6,4) + 3. \tag{9.17} \]

Exercise 7 will consider the minimal polynomials of the other 3-periods. ◁▷


Example 9.2.13 The 1-periods for $p = 19$ are the primitive 19th roots of unity
$(1,\lambda) = \zeta_{19}^{\lambda}$ for $\lambda = 1, \ldots, 18$. In Example 9.2.12, we noted that $H_3 = \{1,7,11\}$,
which means that

\[ (3,1) = \zeta_{19} + \zeta_{19}^{7} + \zeta_{19}^{11}. \]

By Exercise 7 the minimal polynomial of $\zeta_{19}$, $\zeta_{19}^{7}$, and $\zeta_{19}^{11}$ over $L_3$ is

\[ (x - \zeta_{19})(x - \zeta_{19}^{7})(x - \zeta_{19}^{11}) = x^3 - (3,1)x^2 + (3,8)x - 1. \tag{9.18} \]

Combining this with (9.17) and (9.16), one can write an explicit formula for $\zeta_{19}$ that
involves only square and cube roots. ◁▷
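The coefficients in (9.16), (9.17), and (9.18) can be checked numerically by computing the relevant sums and products of periods directly. The following sketch is our own illustration, not part of the text: it uses floating-point arithmetic, so it confirms the formulas only up to rounding error, and the helper `period` simply evaluates (9.13) for p = 19 and g = 2.

```python
import cmath

def period(p, g, f, lam):
    """Numerical value of (f, lam) from (9.13); assumes g is a primitive root mod p."""
    e = (p - 1) // f
    zeta = cmath.exp(2j * cmath.pi / p)
    return sum(zeta ** (lam * pow(g, e * j, p) % p) for j in range(f))

def P(f, lam):
    return period(19, 2, f, lam)

# (9.16): elementary symmetric functions of the 6-periods.
a, b, c = P(6, 1), P(6, 2), P(6, 4)
sym6 = (a + b + c, a * b + a * c + b * c, a * b * c)

# (9.17): sum and product of the 3-periods (3,1) and (3,8).
sum3, prod3 = P(3, 1) + P(3, 8), P(3, 1) * P(3, 8)

# (9.18): elementary symmetric functions of zeta, zeta^7, zeta^11.
z = cmath.exp(2j * cmath.pi / 19)
r1, r2, r3 = z, z ** 7, z ** 11
sym1 = (r1 + r2 + r3, r1 * r2 + r1 * r3 + r2 * r3, r1 * r2 * r3)

print(sym6)          # compare with -1, -6, 7
print(sum3, prod3)   # compare with (6,1) and (6,4) + 3
print(sym1)          # compare with (3,1), (3,8), 1
```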


In Exercises 8 and 9 you will use similar methods to derive the formula

\[ \cos(2\pi/17) = -\frac{1}{16} + \frac{1}{16}\sqrt{17} + \frac{1}{16}\sqrt{34 - 2\sqrt{17}} + \frac{1}{8}\sqrt{17 + 3\sqrt{17} - \sqrt{34 - 2\sqrt{17}} - 2\sqrt{34 + 2\sqrt{17}}}, \tag{9.19} \]

due to Gauss. In Chapter 10, we will see that this leads immediately to a
straightedge-and-compass construction of a regular 17-gon.
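Gauss's formula (9.19) can be checked in a few lines. This is just a floating-point comparison of the two sides, offered as a sanity check rather than a proof; the variable names are our own.

```python
import math

s17 = math.sqrt(17)
inner_minus = math.sqrt(34 - 2 * s17)
inner_plus = math.sqrt(34 + 2 * s17)

# Right-hand side of (9.19).
rhs = (-1 / 16
       + s17 / 16
       + inner_minus / 16
       + math.sqrt(17 + 3 * s17 - inner_minus - 2 * inner_plus) / 8)

lhs = math.cos(2 * math.pi / 17)
print(lhs, rhs)
```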
One reason these methods work so well is that the $f$-periods are linearly independent
over $\mathbb{Q}$ by Lemma 9.2.4. Hence any linear combination of $f$-periods with
coefficients in $\mathbb{Z}$ or $\mathbb{Q}$ is unique. However, we’ve seen cases where generalized
periods $(f,\lambda)$, $p \mid \lambda$, also occur. But this is no problem, since $(f,\lambda) = f$ in such a
case, and we also know that the distinct $f$-periods sum to $-1$ (see Exercise 6). Thus
a generalized $f$-period can be expressed in terms of ordinary $f$-periods. Hence we
can always reduce to an expression involving only $f$-periods, where we know that
the representation is unique.


D. Solvability by Radicals. When studying $\mathbb{Q} \subset \mathbb{Q}(\zeta_p)$, we saw in (9.9) that a
prime factorization $p - 1 = q_1 q_2 \cdots q_r$ gives intermediate fields

\[ \mathbb{Q} = L_{q_1 \cdots q_r} \subset L_{q_2 \cdots q_r} \subset \cdots \subset L_{q_{r-1} q_r} \subset L_{q_r} \subset L_1 = \mathbb{Q}(\zeta_p) \]

such that $[L_{q_{i+1} \cdots q_r} : L_{q_i q_{i+1} \cdots q_r}] = q_i$. If we focus on one of these fields and the next
larger one, then we get an extension of the form

\[ L_{fq} \subset L_f, \tag{9.20} \]

where $fq$ divides $p - 1$ and $q$ is prime. The theory of periods shows that $(f,1)$ is
a primitive element of $L_f$ over $L_{fq}$, and the examples given above make it clear that in any
particular case we can compute the minimal polynomial of $(f,1)$ over $L_{fq}$.

When $p = 19$, the minimal polynomials found in Examples 9.2.10-9.2.13 have
degrees 2 or 3. Hence their roots can be found by known formulas. But when $p = 11$,
the period $(2,1) = 2\cos(2\pi/11)$ has minimal polynomial

\[ y^5 + y^4 - 4y^3 - 3y^2 + 3y + 1 \]

(see Exercise 10). Is this polynomial solvable by radicals? More generally, are the
minimal polynomials of periods solvable by radicals?

From a theoretical point of view, this question is trivial, since $\mathbb{Q} \subset \mathbb{Q}(\zeta_p)$ is a radical
extension ($\zeta_p^p = 1 \in \mathbb{Q}$). It follows by definition that any $f$-period $(f,\lambda)$ is expressible
by radicals over $\mathbb{Q}$, since $(f,\lambda) \in \mathbb{Q}(\zeta_p)$. Things become even more trivial if you
recall that when we studied solvability by radicals in Chapter 8, we felt free to adjoin
any roots of unity we needed, including $\zeta_p$.

Hence it appears that solving the minimal polynomials of periods by radicals is 
completely uninteresting. The problem is that this ignores the inductive nature of 
what’s going on. The real goal, which goes back to Lagrange’s strategy for solving 
equations, is to construct pth roots of unity using only radicals and roots of unity of 
lower degree (we will discuss Lagrange’s strategy in Chapter 12). This is what Gauss 
does in Disquisitiones. 

Thus, when studying $\mathbb{Q} \subset \mathbb{Q}(\zeta_p)$, we may assume inductively that we know all
$m$th roots of unity for $m < p$. Furthermore, as explained in the discussion preceding
(9.20), it suffices to consider the extension $L_{fq} \subset L_f$, where we may assume that the
$fq$-periods are known. The idea is to express an $f$-period in terms of radicals that
are $q$th roots involving $fq$-periods and $q$th roots of unity. These roots of unity are
known, since $q < p$.
To do this in practice, we will use Lagrange resolvents. Let $\omega$ be a primitive $q$th
root of unity. In Exercise 11 you will prove that

\[ \mathrm{Gal}(L_f(\omega)/L_{fq}(\omega)) \simeq \mathrm{Gal}(L_f/L_{fq}) \simeq \mathbb{Z}/q\mathbb{Z}. \]

Since $L_{fq}(\omega)$ contains a primitive $q$th root of unity, Lemma 8.3.2 implies that $L_f(\omega)$ is
obtained from $L_{fq}(\omega)$ by adjoining a $q$th root. Furthermore, the proof of Lemma 8.3.2
shows that the element adjoined is a Lagrange resolvent. Recall from (8.7) that if
$\sigma$ is a generator of $\mathrm{Gal}(L_f(\omega)/L_{fq}(\omega))$ and $\beta \in L_f(\omega)$, then we get the Lagrange
resolvents

\[ \alpha_i = \beta + \omega^{-i}\sigma(\beta) + \cdots + \omega^{-i(q-1)}\sigma^{q-1}(\beta) \]

for $i = 0, \ldots, q-1$. We will use $\beta = (f,1) \in L_f \subset L_f(\omega)$. In Exercise 11 you will
show that we can pick the generator $\sigma$ so that for any $f$-period $(f,\lambda)$,

\[ \sigma((f,\lambda)) = (f, g^{e/q}\lambda) \tag{9.21} \]

(note that $q \mid e$, since $fq \mid p-1$). Thus the above Lagrange resolvents can be written

\[ \alpha_i = (f,1) + \omega^{-i}(f, g^{e/q}) + \cdots + \omega^{-i(q-1)}(f, g^{(q-1)e/q}). \tag{9.22} \]

If we set $A_i = \alpha_i^q$, then we can define $\sqrt[q]{A_i} = \alpha_i$. Then the $f$-periods in (9.22) can be
expressed in terms of radicals as follows.


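Theorem 9.2.14 Let $\alpha_i$ and $A_i = \alpha_i^q$ be defined as above.
(a) $\alpha_0 \in L_{fq}(\omega)$ and $A_i = \alpha_i^q \in L_{fq}(\omega)$ for $1 \le i \le q-1$.
(b) For $0 \le \ell \le q-1$,

\[ (f, g^{\ell e/q}) = \frac{1}{q}\Bigl(\alpha_0 + \omega^{\ell}\sqrt[q]{A_1} + \omega^{2\ell}\sqrt[q]{A_2} + \cdots + \omega^{(q-1)\ell}\sqrt[q]{A_{q-1}}\Bigr). \]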


Before beginning the proof, let us explain the $f$-periods appearing in the theorem.
The extension $L_{fq} \subset L_f$ corresponds to the subgroups $H_f \subset H_{fq}$ of $(\mathbb{Z}/p\mathbb{Z})^*$. Since
$\frac{e}{q} = \frac{p-1}{fq}$, Exercise 1 shows that these subgroups are generated by $[g^e]$ and $[g^{e/q}]$,
respectively. In Exercise 11 you will use this to prove that

\[ H_{fq} = H_f \cup g^{e/q}H_f \cup g^{2e/q}H_f \cup \cdots \cup g^{(q-1)e/q}H_f. \tag{9.23} \]

By Proposition 9.2.8, the $f$-periods $(f, g^{\ell e/q})$ are the conjugates of $(f,1)$ over $L_{fq}$.


Proof of Theorem 9.2.14: Part (a) follows easily from the properties of Lagrange
resolvents presented in the proof of Lemma 8.3.2. For part (b), let $\lambda_{\ell} = g^{\ell e/q}$. Then
for any integer $m$ we have

\begin{align*}
\sum_{i=0}^{q-1} \omega^{im}\alpha_i &= \sum_{i=0}^{q-1} \omega^{im}\Bigl(\sum_{\ell=0}^{q-1} \omega^{-i\ell}(f,\lambda_{\ell})\Bigr) \\
&= \sum_{\ell=0}^{q-1} \Bigl(\sum_{i=0}^{q-1} \omega^{i(m-\ell)}\Bigr)(f,\lambda_{\ell}) = q\,(f,\lambda_m),
\end{align*}

where the last equality follows from Exercise 9 of Section A.2. This gives the desired
formula for $(f,\lambda_m)$, since $\alpha_i = \sqrt[q]{A_i}$ for $i > 0$. ∎


From a computational point of view, the results of this section give a systematic
method for expressing $A_i = \alpha_i^q$ in terms of $fq$-periods and $q$th roots of unity. This
works because $f$-periods and $fq$-periods are linearly independent not only over $\mathbb{Q}$
but also over $\mathbb{Q}(\omega)$, where $\omega$ is a primitive $q$th root of unity (you will prove this
in Exercise 12). Thus the radical formula for $(f, g^{\ell e/q})$ given in Theorem 9.2.14 is
explicitly computable.
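To see Theorem 9.2.14 in action, here is a numerical sketch for one case of our own choosing: $p = 19$, $g = 2$, $f = 2$, and $q = 3$, so that $L_6 \subset L_2$ has degree 3 and $g^{e/q} = 2^3 = 8$. The code builds the Lagrange resolvents from (9.22), recovers the three conjugate 2-periods from part (b), and checks that $\alpha_1^3$ is unchanged when the periods are shifted by $\sigma$ as in (9.21). The names `period` and `resolvents` are hypothetical helpers, and floating-point arithmetic means this is a check up to rounding, not a proof.

```python
import cmath

def period(p, g, f, lam):
    """Numerical value of (f, lam) from (9.13); assumes g is a primitive root mod p."""
    e = (p - 1) // f
    zeta = cmath.exp(2j * cmath.pi / p)
    return sum(zeta ** (lam * pow(g, e * j, p) % p) for j in range(f))

p, g, f, q = 19, 2, 2, 3
e = (p - 1) // f                 # e = 9, and q divides e
step = pow(g, e // q, p)         # g^(e/q) = 8 modulo 19
omega = cmath.exp(2j * cmath.pi / q)

def resolvents(lam):
    """Lagrange resolvents alpha_0, ..., alpha_{q-1} of (f, lam), following (9.22)."""
    conj = [period(p, g, f, lam * pow(step, l, p)) for l in range(q)]
    return [sum(omega ** (-i * l) * conj[l] for l in range(q)) for i in range(q)]

alpha = resolvents(1)

# Part (b): recover (f, g^(l*e/q)) as (1/q) * sum_i omega^(i*l) * alpha_i.
recovered = [sum(omega ** (i * l) * alpha[i] for i in range(q)) / q for l in range(q)]

# Applying sigma multiplies alpha_1 by omega, so the cube A_1 is unchanged.
shifted = resolvents(step)
print(alpha[1] ** 3, shifted[1] ** 3)
```

In this sketch the $q$th root $\sqrt[q]{A_i}$ is taken to be $\alpha_i$ itself, as in the text; choosing the correct root from $A_i$ alone would require extra numerical information.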


Mathematical Notes 
Here are comments on two topics relevant to what we did in this section. 


• Primitive Roots Modulo p. The formulas presented in this section illustrate the
usefulness of knowing primitive roots modulo $p$. Gauss explains a method for finding
primitive roots in [5, Art. 73-74]. See also [10, p. 163].

Let $g_p$ denote the smallest positive primitive root modulo $p$. For example, 2 is a
primitive root modulo 19, which implies that $g_{19} = 2$. In 1962 Burgess [3] proved
that for any $\varepsilon > 0$ there is a positive constant $C(\varepsilon)$ such that

\[ g_p \le C(\varepsilon)\, p^{\frac{1}{4} + \varepsilon} \]

for all odd primes $p$. This says that $g_p$ can’t be too big relative to $p$. On the other
hand, Kearnes [9] proved in 1984 that given any integer $m > 0$ there are infinitely
many primes $p > m$ such that $g_p > m$. So $g_p$ can still get large.

If we fix a primitive root $g$ modulo $p$, then the discrete log problem asks the
following: Given an integer $a$ not divisible by $p$, find $i$ such that $a \equiv g^i \bmod p$.
We write this as $i = \log_g a$. It is easy to describe an algorithm for finding $\log_g a$
(divide $a - g^i$ by $p$ for $i = 0, 1, 2, \ldots$, and stop when the remainder is zero). But
finding an efficient algorithm for $\log_g a$ is much more difficult. Several modern
encryption schemes, including the Pohlig-Hellman symmetric key exponentiation
cipher (described in [10, Sec. 3.1]) and the Diffie-Hellman key exchange protocol
(described in [2, Sec. 7.4] and [10, Sec. 3.1]), would be easy to break if discrete logs
were easy to compute.
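The naive algorithm just described, together with a brute-force search for the smallest primitive root $g_p$, can be sketched as follows. The function names are our own, and both routines take on the order of $p$ steps (the search for $g_p$ may take more), so they are only suitable for small primes; nothing here is meant as an efficient method.

```python
def is_primitive_root(g, p):
    """True if the powers of g run through all p - 1 nonzero classes modulo p."""
    seen, x = set(), 1
    for _ in range(p - 1):
        x = x * g % p
        seen.add(x)
    return len(seen) == p - 1

def smallest_primitive_root(p):
    """Smallest positive primitive root modulo the prime p, by exhaustive search."""
    return next(g for g in range(1, p) if is_primitive_root(g, p))

def discrete_log(a, g, p):
    """Smallest i >= 0 with g^i congruent to a modulo p, found by trying i = 0, 1, 2, ..."""
    x = 1
    for i in range(p - 1):
        if x == a % p:
            return i
        x = x * g % p
    raise ValueError("no logarithm: p divides a, or g is not a primitive root")

print(smallest_primitive_root(19))   # the text notes that g_19 = 2
print(discrete_log(7, 2, 19))        # 2^6 = 64 is congruent to 7 modulo 19
```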

Primitive roots modulo p are also used in the Digital Signature Algorithm sug- 
gested by the National Institute of Standards and Technology. A description can be 
found in [2, Sec. 11.5]. As above, one could forge digital signatures if discrete logs 
were easy to compute. 

There are also purely mathematical questions about primitive roots modulo p. A 
list of unsolved problems can be found in [7, Sec. F.9]. 


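• Periods and Gauss Sums. Let $p = 17$. By Exercise 9 we have

\begin{align*}
(8,1) &= \tfrac{1}{2}\bigl(-1 + \sqrt{17}\bigr), \\
(8,3) &= \tfrac{1}{2}\bigl(-1 - \sqrt{17}\bigr),
\end{align*}

which easily implies

\[ (8,1) - (8,3) = \sqrt{17}. \]

In Exercise 13 you will show that this can be written

\[ \sum_{a=0}^{16} \Bigl(\frac{a}{17}\Bigr)\zeta_{17}^{a} = \sqrt{17}, \tag{9.24} \]

where, for an odd prime $p$, the Legendre symbol $\bigl(\frac{a}{p}\bigr)$ is defined by

\[ \Bigl(\frac{a}{p}\Bigr) = \begin{cases} 0, & \text{if } p \mid a, \\ +1, & \text{if } p \nmid a \text{ and } x^2 \equiv a \bmod p \text{ has a solution}, \\ -1, & \text{if } p \nmid a \text{ and } x^2 \equiv a \bmod p \text{ has no solution}. \end{cases} \]

More generally, for an odd prime $p$, a quadratic Gauss sum is defined to be

\[ \sum_{a=0}^{p-1} \Bigl(\frac{a}{p}\Bigr)\zeta_p^{a}. \]

Gauss used these sums to prove quadratic reciprocity. He also proved the remarkable
formula

\[ \sum_{a=0}^{p-1} \Bigl(\frac{a}{p}\Bigr)\zeta_p^{a} = \begin{cases} \sqrt{p}, & \text{if } p \equiv 1 \bmod 4, \\ i\sqrt{p}, & \text{if } p \equiv 3 \bmod 4. \end{cases} \]

Notice how this generalizes (9.24). A careful discussion of quadratic Gauss sums
can be found in [8, Ch. 6].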
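Gauss's formula for the quadratic Gauss sum is easy to test numerically for small primes. In the sketch below, the Legendre symbol is computed with Euler's criterion, $\bigl(\frac{a}{p}\bigr) \equiv a^{(p-1)/2} \bmod p$, a standard fact that is not discussed in this section; the function names are our own, and $\zeta_p$ is taken to be $e^{2\pi i/p}$.

```python
import cmath
import math

def legendre(a, p):
    """Legendre symbol (a/p) for an odd prime p, via Euler's criterion."""
    r = pow(a, (p - 1) // 2, p)      # r is 0, 1, or p - 1
    return -1 if r == p - 1 else r

def gauss_sum(p):
    """Quadratic Gauss sum: sum of (a/p) * zeta_p^a for a = 0, ..., p-1."""
    zeta = cmath.exp(2j * cmath.pi / p)
    return sum(legendre(a, p) * zeta ** a for a in range(p))

print(gauss_sum(17), math.sqrt(17))        # p = 17 is 1 mod 4
print(gauss_sum(19), 1j * math.sqrt(19))   # p = 19 is 3 mod 4
```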


Historical Notes 


Most results of this section are implicit in Section VII of Disquisitiones. The
main difference is that we have stated things in terms of the Galois correspondence,
which to each divisor $f$ of $p-1$ associates the subgroup $H_f$ of $(\mathbb{Z}/p\mathbb{Z})^*$ and the
subfield $L_f$ of $\mathbb{Q}(\zeta_p)$. For Gauss, on the other hand, each divisor $f$ gets associated to
the collection of $f$-periods $(f,\lambda)$. In general, he considers elements rather than the
fields in which they lie. For example, consider [5, Art. 346], which asserts that given
$(f,\lambda)$, any other $f$-period $(f,\mu)$ can be expressed as

\[ (f,\mu) = a_0 + a_1(f,\lambda) + a_2(f,\lambda)^2 + \cdots + a_{e-1}(f,\lambda)^{e-1} \]

for some uniquely determined rational numbers $a_0, \ldots, a_{e-1}$. For us, this gives the unique
representation of $(f,\mu)$ as an element of $L_f = \mathbb{Q}((f,\lambda))$.

Another difference is our use of cosets. For example, if $g$ is a primitive root
modulo $p$ and $f$ divides $p-1$, then Gauss notes that the distinct $f$-periods are

\[ (f,1), (f,g), (f,g^2), \ldots, (f,g^{e-1}), \]

where $e = \frac{p-1}{f}$. For us, this follows from Lemma 9.2.4, since $H_f \subset (\mathbb{Z}/p\mathbb{Z})^*$ is
generated by $[g^e]$, so that its cosets in $(\mathbb{Z}/p\mathbb{Z})^*$ are

\[ [1]H_f, [g]H_f, [g^2]H_f, \ldots, [g^{e-1}]H_f. \]

Cosets give a conceptual basis for what Gauss is doing, and the same is true for the
minimal polynomials computed in Proposition 9.2.8.

It is also interesting to note that Gauss makes implicit use of the Galois group
$\mathrm{Gal}(\mathbb{Q}(\zeta_p)/\mathbb{Q})$. We saw in Lemma 9.2.4 that $\sigma(\zeta_p) = \zeta_p^k$ implies that $\sigma((f,\lambda)) = (f,k\lambda)$.
Now consider the following quote from [5, Art. 345]:


IV. It follows that if in any rational integral algebraic function $F = \varphi(t,u,v,\ldots)$
we substitute for the unknowns $t, u, v$, etc. respectively the similar periods
$(f,\lambda)$, $(f,\mu)$, $(f,\nu)$, etc., its value will be reducible to the form

\[ A + B(f,1) + B'(f,g) + B''(f,g^2) \cdots + B^{\varepsilon}(f,g^{e-1}) \]

and the coefficients $A$, $B$, $B'$, etc. will all be integers if all the coefficients in $F$ are
integers. But if afterward we substitute $(f,k\lambda)$, $(f,k\mu)$, $(f,k\nu)$, etc. for $t$, $u$, $v$,
etc. respectively, the value of $F$ will be reduced to $A + B(f,k) + B'(f,kg) +$ etc.


A “rational integral algebraic function” is a polynomial with coefficients in Q. Here 
is an example of what this means. 


Example 9.2.15 In Example 9.2.10, we showed that

\[ (6,1)^2 = 4 - (6,2) \]

when $p = 19$. Using $k = 2$, the above quotation from Gauss tells us that

\[ (6,2)^2 = 4 - (6,4). \]

In modern terms, this follows by applying the automorphism $\sigma \in \mathrm{Gal}(\mathbb{Q}(\zeta_{19})/\mathbb{Q})$
that takes $\zeta_{19}$ to $\zeta_{19}^2$. So the Galois action is implicit in Gauss’s theory! ◁▷


Gauss’s result that $x^p - 1$ is solvable by radicals is less compelling from the modern
perspective, though it is still interesting when one thinks inductively. But historically, 
being able to solve special but nontrivial equations of high degree was important. 
Here is what Gauss says in [5, Art. 359]: 


Everyone knows that the most eminent geometers have been unsuccessful in 
the search for a general solution of equations higher than the fourth degree, 
or (to define the search more accurately) for THE REDUCTION OF MIXED
EQUATIONS TO PURE EQUATIONS. ... Nevertheless, it is certain that there 
are innumerable mixed equations of every degree that admit a reduction to pure 
equations, and we trust that geometers will find it gratifying if we show that our 
equations are always of this kind. 


For Gauss, an equation is “pure” if it is of the form $x^m - A = 0$ and “mixed” otherwise.
Thus, reducing “mixed equations to pure equations” is what we call solvability by
radicals. Of course, in saying “our equations,” Gauss is referring to the minimal
polynomials satisfied by the periods, as constructed in Proposition 9.2.8. 

Gauss’s study of the pth roots of unity is an important midpoint in the development 
leading from Lagrange to the emergence of Galois theory. Gauss uses Lagrange’s 
inductive strategy to work out the Galois correspondence for $\mathbb{Q} \subset \mathbb{Q}(\zeta_p)$, and his
theory of periods makes everything explicit and computable. He also shows that 
Lagrange resolvents are the correct tool for studying solvability by radicals, paving 
the way for Galois’s analysis of the general case. 

In spite of its beauty, what Gauss does in Section VII of [5] is not perfect. Some 
proofs are omitted and others have gaps. For example, Gauss does not prove the 
assertion about the Galois action made in the quotation before Example 9.2.15. Also, 
as noted in [Tignol, p. 195], Gauss’s study of solvability assumes without proof that 
when fq divides p— 1, the f-periods are linearly independent over the field generated 
by the $q$th roots of unity. (You will prove this in Exercise 12.)

Galois was very aware of Section VII of Disquisitiones. For example, Galois
describes the “group” of $\mathbb{Q} \subset \mathbb{Q}(\zeta_n)$, $n$ prime, as follows [Galois, pp. 51-53]:


In the case of the equation $\frac{x^n - 1}{x - 1} = 0$, if one supposes $a = r$, $b = r^g$, $c = r^{g^2}$, ...,
$g$ being a primitive root, the group of permutations will simply be as follows:

a b c d ... k
b c d ... k a
c d ... k a b
...
k a b c ... i


in this particular case, the number of permutations is equal to the degree of 
the equation, and the same will be true for equations where all of the roots are 
rational functions of each other. 


Here, r is a primitive nth root of unity. Each line is a cyclic permutation of the one 
above it, which leads to a cyclic group of order n — 1. This quotation also reveals that 
for Galois, a “permutation” was an arrangement of the roots and that the permutations 
(in the modern sense) are obtained by mapping the first arrangement in the table to 
the others. You will work out the details of this in Exercise 14. We will say more 
about how Galois thought about Galois groups in Chapter 12. 


Exercises for Section 9.2 


Exercise 1. Let $G$ be a cyclic group of order $n$ and let $g$ be a generator of $G$.

(a) Let $f$ be a positive divisor of $n$ and set $e = n/f$. Prove that $H_f = \langle g^e \rangle$ has order $f$ and
hence is the unique subgroup of order $f$.

(b) Let $f$ and $f'$ be positive divisors of $n$. Prove that $H_f \subset H_{f'}$ if and only if $f \mid f'$.


Exercise 2. Prove Proposition 9.2.1.
Exercise 3. Let $\eta_1, \eta_2, \eta_3$ be as in Example 9.2.2.
(a) We know that $\zeta_7$ is a root of $x^6 + x^5 + x^4 + x^3 + x^2 + x + 1 = 0$. Dividing by $x^3$ gives

\[ x^3 + x^2 + x + 1 + x^{-1} + x^{-2} + x^{-3} = 0. \]

Use this to show that $\eta_1, \eta_2, \eta_3$ are roots of $y^3 + y^2 - 2y - 1$.

(b) Prove that $[\mathbb{Q}(\eta_1) : \mathbb{Q}] = 3$, and conclude that $\mathbb{Q}(\eta_1)$ is the fixed field of the subgroup
$\{e, \tau\} \subset \mathrm{Gal}(\mathbb{Q}(\zeta_7)/\mathbb{Q})$, where $\tau$ is complex conjugation.
(c) Prove (9.10).


Exercise 4. Let A C B be subgroups of a group G, and assume that A has index d in B. Prove 
that every left coset of B in G is a disjoint union of d left cosets of A in G. 


Exercise 5. Complete the proof of Proposition 9.2.8. 
Exercise 6. Prove that the sum of the distinct f-periods equals —1. 


Exercise 7. This exercise is concerned with the details of Examples 9.2.10, 9.2.11, 9.2.12, and 
9.2.13. 
(a) Show that 2 is a primitive root modulo 19. 
(b) Use the methods of Example 9.2.10 to obtain formulas for $(6,2)^2$ and $(6,4)^2$.
(c) Show that the formulas of part (b) follow from $(6,1)^2 = 4 - (6,2)$ and part (d) of
Lemma 9.2.4.
(d) Prove (9.15) and use this and Exercise 6 to show that $(6,1)(6,2)(6,4) = 7$.
(e) Find the minimal polynomials of $(3,2)$ and $(3,4)$ over the field $L_6$ considered in
Example 9.2.12.
(f) Show that (9.18) is the minimal polynomial of $\zeta_{19}$ over the field $L_3$ considered in
Example 9.2.13.


Exercise 8. In this exercise and the next, you will derive Gauss’s radical formula (9.19) for
$\cos(2\pi/17)$.

(a) Show that 3 is a primitive root modulo 17.

(b) Show that

\begin{align*}
H_8 &= \{1,2,4,8,9,13,15,16\}, \\
H_4 &= \{1,4,13,16\}, \\
H_2 &= \{1,16\},
\end{align*}

where we write the congruence class $[n]$ modulo 17 as $n$.
(c) Use Propositions 9.2.8 and 9.2.9 to compute the following minimal polynomials:


Extension | Primitive Elements | Minimal Polynomial


The resulting quadratic equations are easy to solve using the quadratic formula. But how do
the roots correspond to the periods? For example, the roots $(8,1), (8,3)$ of $x^2 + x - 4$ are
$(-1 \pm \sqrt{17})/2$. How do these match up? The answer will be given in the next exercise.


Exercise 9. In this exercise, you will use numerical computations and the previous exercise to
find radical expressions for various $f$-periods when $p = 17$.

(a) Show that

\begin{align*}
(8,1) &= 2\cos(2\pi/17) + 2\cos(4\pi/17) + 2\cos(8\pi/17) + 2\cos(16\pi/17), \\
(4,1) &= 2\cos(2\pi/17) + 2\cos(8\pi/17), \\
(4,3) &= 2\cos(6\pi/17) + 2\cos(10\pi/17), \\
(2,1) &= 2\cos(2\pi/17).
\end{align*}

Then compute each of these periods to five decimal places.
(b) Use the numerical computations of part (a) and the quadratic polynomials of Exercise 8
to show that

\begin{align*}
(8,1) &= \tfrac{1}{2}\bigl(-1 + \sqrt{17}\bigr), \\
(8,3) &= \tfrac{1}{2}\bigl(-1 - \sqrt{17}\bigr), \\
(4,1) &= \tfrac{1}{4}\Bigl(-1 + \sqrt{17} + \sqrt{34 - 2\sqrt{17}}\Bigr), \\
(4,3) &= \tfrac{1}{4}\Bigl(-1 - \sqrt{17} + \sqrt{34 + 2\sqrt{17}}\Bigr).
\end{align*}

(c) Use the quadratic polynomial $x^2 - (4,1)x + (4,3)$ and part (b) to derive (9.19).


Exercise 10. Let $p = 11$. Prove that $y^5 + y^4 - 4y^3 - 3y^2 + 3y + 1$ is the minimal polynomial
of the 2-period $(2,1) = 2\cos(2\pi/11)$.


Exercise 11. Let $L_{fq} \subset L_f$ be the extension studied in Theorem 9.2.14. Thus $f$ and $fq$ divide
$p-1$, and $q$ is prime. As usual, $ef = p-1$ and $g$ is a primitive root modulo $p$. Finally, let $\omega$
be a primitive $q$th root of unity.
(a) Let $\tau \in \mathrm{Gal}(\mathbb{Q}(\zeta_p)/\mathbb{Q})$ satisfy $\tau(\zeta_p) = \zeta_p^{g^{e/q}}$, and let $\sigma' = \tau|_{L_f}$ be the restriction of $\tau$ to
$L_f$. Prove that $\sigma'$ generates $\mathrm{Gal}(L_f/L_{fq})$.
(b) Prove that $\mathrm{Gal}(L_f(\omega)/L_{fq}(\omega)) \simeq \mathrm{Gal}(L_f/L_{fq})$, where the isomorphism is defined by
restriction to $L_f$.
(c) Let $\sigma \in \mathrm{Gal}(L_f(\omega)/L_{fq}(\omega))$ map to the element $\sigma' \in \mathrm{Gal}(L_f/L_{fq})$ constructed in part (a).
Prove that $\sigma$ satisfies (9.21).
(d) Prove the coset decomposition of $H_{fq}$ given in (9.23).


Exercise 12. Let $p$ be an odd prime, and let $m$ be a positive integer relatively prime to $p$.
(a) Prove that $1, \zeta_p, \ldots, \zeta_p^{p-2}$ are linearly independent over $\mathbb{Q}(\zeta_m)$.

(b) Explain why part (a) implies that $\zeta_p, \ldots, \zeta_p^{p-1}$ are linearly independent over $\mathbb{Q}(\zeta_m)$.
(c) Let $f \mid p-1$. Prove that the $f$-periods are linearly independent over $\mathbb{Q}(\zeta_m)$.


Exercise 13. Prove (9.24). 


Exercise 14. Consider the quotation from Galois given at the end of the Historical Notes. 

(a) Show that the permutations obtained by mapping the first line in the displayed table to 
the other lines give a cyclic group of order n — 1. Also explain how these permutations 
relate to the Galois group. 

(b) Explain what Galois is saying in the last sentence of the quotation. 


Exercise 15. What are the 1-periods?

Exercise 16. Redo Exercise 3 using periods.


Exercise 17. Let $f$ be an even divisor of $p-1$, where $p$ is an odd prime. Prove that every
$f$-period $(f,\lambda)$ lies in $\mathbb{R}$.
CHAPTER 10


GEOMETRIC CONSTRUCTIONS 


The idea of geometric constructions using straightedge and compass goes back to the 
ancient Greeks. A straightedge is an unmarked ruler. This chapter will explore the 
surprising connection between geometric constructions and Galois theory. Topics 
covered include classic problems from Greek geometry, the work of Gauss described 
in Chapter 9, and the use of origami to solve cubic and quartic equations. 


10.1 CONSTRUCTIBLE NUMBERS 


We assume that you remember how to do standard straightedge-and-compass con- 
structions such as bisecting a given angle, erecting a perpendicular to a given line at 
a given point, and dropping a perpendicular from a given point to a given line. 

To prove theorems about geometric constructions, we need a careful description 
of what a construction is. The basic idea is that we begin with some known points. 
From these points, we use straightedge and compass to construct lines and circles: 


C1. From $\alpha \ne \beta$, we can draw the line $\ell$ that goes through $\alpha$ and $\beta$.
C2. From $\alpha \ne \beta$ and $\gamma$, we can draw the circle $C$ with center $\gamma$ whose radius is the
distance from $\alpha$ to $\beta$.

In this labeling, “C” stands for “Construct.” From these lines and circles, their points
of intersection (assuming they are nonempty) give new points:


P1. The point of intersection of distinct lines $\ell_1$ and $\ell_2$ constructed as above.
P2. The points of intersection of a line $\ell$ and circle $C$ constructed as above.
P3. The points of intersection of distinct circles $C_1$ and $C_2$ constructed as above.


Here, “P” stands for “Point.” We regard these newly constructed points as known. 
Then we can apply C1, C2, P1, P2, and P3 to our enlarged collection of known points. 
We continue this process until the construction is completed. 

For us, the plane will be the field of complex numbers C, so that constructing a 
point means constructing a complex number. Our constructions will all start from 
the same two numbers, 0 and 1. This leads to the following definition. 


Definition 10.1.1 A complex number $\alpha$ is constructible if there is a finite sequence
of straightedge-and-compass constructions using C1, C2, P1, P2, and P3 that begins
with 0 and 1 and ends with $\alpha$.


Here are some examples of constructible numbers. 


Example 10.1.2 


(a) From 0 and 1, we get the x-axis using C1 and the circle of radius 1 centered at
1 using C2. These intersect in the numbers 0 and 2. By P2, 2 is constructible.
Iterating this shows that every $n \in \mathbb{Z}$ is constructible.


(b) In Exercise 1 you will use standard methods from high school geometry to
construct the line perpendicular to the x-axis at 0. This will show that the y-axis
is constructible. Then use C2 to construct the circle of radius 1 centered at 0.
These intersect in $\pm i$. By P2, $i \in \mathbb{C}$ is constructible.


These constructions will be useful later. ◁▷


Example 10.1.3 Suppose that we can construct a regular polygon with $n$ sides somewhere
in the plane (we call this a regular $n$-gon). Using two consecutive vertices and
the center of the $n$-gon, we get the triangle shown here:


(We may have constructed the center in the process of constructing the $n$-gon. If
not, then the center is constructible, since it is the intersection of the bisectors of the
interior angles.) One easily sees that the angle at the center is $\theta = 2\pi/n$. In Exercise 2
you will show how to copy this angle to the origin:


Intersecting this with the unit circle shows that the $n$th root of unity $\zeta_n = e^{2\pi i/n}$ is
constructible. In Exercise 2 you will show that this process can be reversed. Hence
$\zeta_n$ is constructible if and only if a regular $n$-gon can be constructed by straightedge
and compass. Section 10.2 will determine those $n$’s for which this is possible. ◁▷


The set of constructible numbers has the following properties. 


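Theorem 10.1.4 The set $\mathscr{C} = \{\alpha \in \mathbb{C} \mid \alpha \text{ is constructible}\}$ is a subfield of $\mathbb{C}$.
Furthermore:

(a) Let $\alpha = a + ib \in \mathbb{C}$, where $a, b \in \mathbb{R}$. Then $\alpha \in \mathscr{C}$ if and only if $a, b \in \mathscr{C}$.

(b) $\alpha \in \mathscr{C}$ implies that $\sqrt{\alpha} \in \mathscr{C}$.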


Proof: We first show that $\mathscr{C}$ is a subgroup of $\mathbb{C}$ under addition. Given $\alpha \in \mathscr{C} \setminus \{0\}$,
construct the line connecting 0 and $\alpha$ by C1 and the circle of radius $|\alpha|$ centered at
the origin by C2. These intersect in $\pm\alpha$, so that $-\alpha$ is constructible by P2.

Now suppose that $\alpha$ and $\beta$ are constructible. If $\alpha$, $\beta$, and 0 are not collinear, then
use C2 twice to construct the circle of radius $|\alpha|$ with center $\beta$ and the circle of radius
$|\beta|$ with center $\alpha$. One of the points of intersection is $\alpha + \beta$:


[Figure: the points 0 and $\alpha + \beta$]


By P3, we conclude that $\alpha + \beta$ is constructible. In Exercise 3 you will show that this
is also true when $\alpha$, $\beta$, and 0 are collinear. Since $0 \in \mathscr{C}$ by definition, it follows that
$\mathscr{C}$ is a subgroup of $\mathbb{C}$ under addition.

We next prove part (a). Given $\alpha = a + ib \in \mathscr{C}$, we can drop perpendiculars from
$\alpha$ to the x-axis and y-axis constructed in Example 10.1.2. This shows that $a, ib \in \mathscr{C}$.
Since the circle of radius $|ib|$ centered at 0 intersects the x-axis at $\pm b$, C2 and P2 imply
that $b \in \mathscr{C}$. Conversely, given $a, b \in \mathscr{C} \cap \mathbb{R}$, applying C2 and P2 to the circle of radius
$|b|$ centered at 0 shows that $ib$ is constructible. By the previous paragraph, $a + ib$ is
constructible and part (a) follows.

Now let $a, b \in \mathscr{C} \cap \{x \in \mathbb{R} \mid x > 0\}$ and consider the following two figures:


[Figure 1: construction of $c = ab$. Figure 2: construction of $d = 1/a$.]


Recall that $i$ was constructed in Example 10.1.2. In Figure 1 we construct $ib$ as above
and then use C1 to draw the line $\ell$ containing $i$ and $a$. By high school geometry, we
can draw the line $\ell'$ through $ib$ that is parallel to $\ell$. Then P1 shows that $\ell'$ and the
x-axis intersect at a constructible real number $c$. But $c = ab$ follows easily by similar
triangles, so that $ab$ is constructible. In a similar way Figure 2 shows that $1/a$ is
constructible. We leave the details as Exercise 3. This exercise will also show that
$\mathscr{C} \cap \mathbb{R}$ is a subfield of $\mathbb{R}$.
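
The similar-triangles claim $c = ab$ can be checked with exact rational arithmetic. The following Python sketch is only an illustration: the helper name `product_via_parallel` and the coordinate setup (points as $(x, y)$ pairs, with $i = (0, 1)$) are assumptions made here, not part of the text. It intersects the x-axis with the line through $ib$ parallel to the line through $i$ and $a$.

```python
from fractions import Fraction

def product_via_parallel(a, b):
    """Return the x-intercept c of the line through ib parallel to the line through i and a.

    Points are (x, y) pairs: i = (0, 1), a = (a, 0), ib = (0, b).
    Hypothetical helper for illustration only.
    """
    a, b = Fraction(a), Fraction(b)
    # Direction vector of the line l through i = (0, 1) and a = (a, 0).
    dx, dy = a - 0, 0 - 1
    # The parallel line l' through (0, b) is (x, y) = (0, b) + t * (dx, dy).
    # It meets the x-axis when b + t * dy = 0.
    t = -b / dy
    return 0 + t * dx

print(product_via_parallel(Fraction(3, 2), Fraction(4, 5)))  # 6/5
```

Because only field operations are used, the intercept stays in whatever field contains $a$ and $b$, which is the point of the construction.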

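To show that $\mathscr{C}$ is closed under multiplication and taking reciprocals of nonzero
elements, let $\alpha = a + ib$ and $\beta = c + id$ be constructible numbers. Then

$$\alpha\beta = (a + ib)(c + id) = ac - bd + i(ad + bc).$$

However, $a, b, c, d \in \mathscr{C} \cap \mathbb{R}$ by part (a), so that $ac - bd, ad + bc \in \mathscr{C} \cap \mathbb{R}$, since the
latter is a subfield of $\mathbb{R}$. Using part (a) again, we conclude that $\alpha\beta \in \mathscr{C}$. Furthermore,
if $\alpha \neq 0$, then

$$\frac{1}{\alpha} = \frac{1}{a + ib} \cdot \frac{a - ib}{a - ib} = \frac{a}{a^2 + b^2} + i\,\frac{-b}{a^2 + b^2}.$$

Using part (a) and the fact that $\mathscr{C} \cap \mathbb{R}$ is a subfield of $\mathbb{R}$, we easily see that $1/\alpha \in \mathscr{C}$.
Thus $\mathscr{C}$ is a subfield of $\mathbb{C}$.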

Finally, we show that $\sqrt{\alpha}$ is constructible when $\alpha$ is. We can assume that $\alpha \neq 0$.
If we write $\alpha = re^{i\theta}$, $r = |\alpha| > 0$, then it suffices to show that $\sqrt{r}\,e^{i\theta/2}$ is constructible.
To prove this, note that the constructibility of $\alpha$ implies the following:


• First, using the x-axis and the line containing 0 and $\alpha$ (by C1), we can construct
the angle $\theta$, which we can then bisect by the usual straightedge-and-compass
construction. Thus the angle $\theta/2$ is constructible.


• Second, the circle of radius $r = |\alpha|$ centered at 0 (by C2) intersects the x-axis at
$\pm r$. By P2, we see that $r$ is constructible.


• Third, if we can construct $\sqrt{r}$, then we can construct the circle of radius $\sqrt{r}$
centered at the origin by C2. Then P2, applied to this circle and the angle $\theta/2$
constructed above, implies that $\sqrt{r}\,e^{i\theta/2}$ is constructible.

To study $\sqrt{r}$, let $r > 0$ be constructible and define the point $\beta$ by the diagram


(10.1)    [Figure: $\beta$ = the intersection of the line $x = 1$ with the semicircle]


In Exercise 3 you will show that $\beta$ is constructible. By Euclidean geometry, the
triangle with vertices 0, $\beta$, and $1 + r$ is a right triangle. The two smaller triangles that
share the side determined by 1 and $\beta$ are similar, so that

$$\frac{r}{d} = \frac{d}{1},$$

where $d$ is the distance from 1 to $\beta$. Thus $d^2 = r$ and hence $d = \sqrt{r}$. Since $d$ is easily
seen to be constructible, we conclude that $\sqrt{r}$ is constructible. ∎
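
The two ingredients of this argument, the height $d$ over the point 1 and the half-angle $\theta/2$, can be imitated numerically. The sketch below is an illustration under stated assumptions: it takes the semicircle in (10.1) to have the segment from 0 to $1 + r$ as its diameter, and the function names are hypothetical.

```python
import cmath
import math

def sqrt_via_semicircle(r):
    """Height d of the semicircle on the diameter [0, 1 + r] above the point x = 1.

    Assumes the semicircle of (10.1) has diameter from 0 to 1 + r.
    """
    center = (1 + r) / 2          # midpoint of the diameter
    radius = (1 + r) / 2
    # Pythagoras in the right triangle with vertices center, 1, and beta.
    return math.sqrt(radius**2 - (1 - center)**2)

def complex_sqrt(alpha):
    """A square root of alpha != 0, assembled as sqrt(r) * e^{i*theta/2}."""
    r, theta = cmath.polar(alpha)
    return cmath.rect(sqrt_via_semicircle(r), theta / 2)

root = complex_sqrt(-3 + 4j)
print(root, root**2)   # the square should reproduce -3+4j up to rounding
```

Here `radius**2 - (1 - center)**2` simplifies to $r$, which is the relation $d^2 = r$ from the proof.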


Here is an example of how to use Theorem 10.1.4.

Example 10.1.5 By Exercise 8 of Section A.2, $\zeta_5 = e^{2\pi i/5}$ is given by the formula

$$\zeta_5 = \frac{-1 + \sqrt{5}}{4} + \frac{i}{2}\sqrt{\frac{5 + \sqrt{5}}{2}}.$$

Since the field $\mathscr{C}$ is closed under the operation of taking square roots, it follows easily
that $\zeta_5$ is constructible. By Example 10.1.3, we conclude that a regular pentagon can
be constructed by straightedge and compass.
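
As a numerical sanity check (not a proof), the closed form $\zeta_5 = \frac{-1+\sqrt{5}}{4} + \frac{i}{2}\sqrt{\frac{5+\sqrt{5}}{2}}$ can be compared with $e^{2\pi i/5}$ in floating point. The variable names below are illustrative choices.

```python
import cmath
import math

# Closed form for zeta_5 built from nested square roots.
real_part = (-1 + math.sqrt(5)) / 4
imag_part = 0.5 * math.sqrt((5 + math.sqrt(5)) / 2)
zeta5_formula = complex(real_part, imag_part)

# Direct definition e^{2*pi*i/5}.
zeta5_direct = cmath.exp(2j * cmath.pi / 5)

print(abs(zeta5_formula - zeta5_direct))   # floating-point error only
print(abs(zeta5_formula**5 - 1))           # floating-point error only
```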


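We call $\mathscr{C}$ the field of constructible numbers. We next study the structure of $\mathscr{C}$.


Theorem 10.1.6 Let $\alpha$ be a complex number. Then $\alpha \in \mathscr{C}$ if and only if there are
subfields

$$\mathbb{Q} = F_0 \subset F_1 \subset \cdots \subset F_{n-1} \subset F_n \subset \mathbb{C}$$

such that $\alpha \in F_n$ and $[F_i : F_{i-1}] = 2$ for $1 \le i \le n$.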


Proof: First suppose that we have $\mathbb{Q} = F_0 \subset \cdots \subset F_n \subset \mathbb{C}$ where $[F_i : F_{i-1}] = 2$. By
Exercise 12 of Section 7.1, $F_i = F_{i-1}(\sqrt{\alpha_i})$ for some $\alpha_i \in F_{i-1}$. We will prove that
$F_i \subset \mathscr{C}$ by induction on $0 \le i \le n$. The case $F_0 = \mathbb{Q} \subset \mathscr{C}$ follows because $\mathscr{C}$ is a
subfield of $\mathbb{C}$. Now suppose that $F_{i-1} \subset \mathscr{C}$. Then $\alpha_i \in F_{i-1}$ is constructible, which
implies $\sqrt{\alpha_i} \in \mathscr{C}$ by Theorem 10.1.4. Thus $F_i = F_{i-1}(\sqrt{\alpha_i}) \subset \mathscr{C}$, as claimed. This
shows that $F_n \subset \mathscr{C}$, so that in particular, any $\alpha \in F_n$ is constructible.

Conversely, given $\alpha \in \mathscr{C}$, we need to create successive quadratic extensions that
start from $\mathbb{Q}$ and eventually contain $\alpha$. We will prove that there are extensions
$\mathbb{Q} = F_0 \subset \cdots \subset F_n \subset \mathbb{C}$ where $[F_i : F_{i-1}] = 2$ such that $F_n$ contains the real and
imaginary parts of all numbers constructed during the course of constructing $\alpha$. The
theorem will follow, since $\alpha = a + ib$ will imply that $a, b \in F_n$, so that $\alpha \in F_n(i)$.

We will prove this by induction on the number $N$ of times we use P1, P2, or
P3 in the construction of $\alpha$. When $N = 0$, we must have $\alpha = 0$ or 1, in which
case we let $F_n = F_0 = \mathbb{Q}$. Now suppose that $\alpha$ is constructed in $N \ge 1$ steps,
where the last step uses P1, the intersection of distinct lines $\ell_1$ and $\ell_2$. But then
$\ell_1$ was constructed from distinct points $\alpha_1$ and $\beta_1$ using C1, and similarly $\ell_2$ was
constructed from distinct points $\alpha_2$ and $\beta_2$. By our inductive assumption, there are
extensions $\mathbb{Q} = F_0 \subset \cdots \subset F_n \subset \mathbb{C}$ where $[F_i : F_{i-1}] = 2$ such that $F_n$ contains the real
and imaginary parts of $\alpha_1, \beta_1, \alpha_2, \beta_2$. We will prove that $F_n$ contains the real and
imaginary parts of $\alpha$.

The line $\ell_1$ has an equation of the form $a_1x + b_1y = c_1$, and goes through $\alpha_1 \neq \beta_1$.
Since the real and imaginary parts of $\alpha_1, \beta_1$ lie in $F_n$, Exercise 4 implies that we can
assume that the coefficients $a_1, b_1, c_1$ lie in $F_n$. Similarly, $\ell_2$ has an equation of the
form $a_2x + b_2y = c_2$ where $a_2, b_2, c_2 \in F_n$. Hence the real and imaginary parts of $\alpha$
give the unique solution of the equations

$$a_1x + b_1y = c_1,$$
$$a_2x + b_2y = c_2.$$

In this situation, linear algebra tells us that the matrix

$$\begin{pmatrix} a_1 & b_1 \\ a_2 & b_2 \end{pmatrix}$$

is invertible (be sure you can explain why), so that the unique solution is

$$\begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} a_1 & b_1 \\ a_2 & b_2 \end{pmatrix}^{-1} \begin{pmatrix} c_1 \\ c_2 \end{pmatrix}.$$

It follows immediately that the real and imaginary parts of $\alpha$ lie in $F_n$.
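
The reason the solution stays in the coefficient field is that inverting a $2 \times 2$ matrix uses only field operations. A minimal Python sketch, taking the rationals as a stand-in for $F_n$ and using Cramer's rule (the function name `intersect_lines` and the sample coefficients are illustrative assumptions):

```python
from fractions import Fraction

def intersect_lines(a1, b1, c1, a2, b2, c2):
    """Solve a1*x + b1*y = c1, a2*x + b2*y = c2 by Cramer's rule."""
    det = a1 * b2 - a2 * b1
    if det == 0:
        raise ValueError("lines are parallel or equal")
    x = (c1 * b2 - c2 * b1) / det
    y = (a1 * c2 - a2 * c1) / det
    return x, y

# Lines x + 2y = 3 and 4x - y = 1/2, with coefficients in Q.
x, y = intersect_lines(Fraction(1), Fraction(2), Fraction(3),
                       Fraction(4), Fraction(-1), Fraction(1, 2))
print(x, y)   # 4/9 23/18
```

Both coordinates are again `Fraction` values, so no field extension is needed for P1.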

Next suppose that the last step in the construction of $\alpha$ uses P2, the intersection of
a line $\ell$ and a circle $C$. Thus $\ell$ is the line through $\alpha_1 \neq \beta_1$ (from C1), and $C$ is the circle
with center $\gamma_2$ and radius $|\alpha_2 - \beta_2|$ (from C2). The five points $\alpha_1, \beta_1, \alpha_2, \beta_2, \gamma_2$ come
from earlier steps in the construction, so that by our inductive assumption, there are
extensions $\mathbb{Q} = F_0 \subset \cdots \subset F_n \subset \mathbb{C}$ where $[F_i : F_{i-1}] = 2$ such that $F_n$ contains the real
and imaginary parts of these five points. We will show that the real and imaginary
parts of $\alpha$ lie in $F_n$ or in a quadratic extension of $F_n$.

As above, $\ell$ is given by an equation

(10.2)    $a_1x + b_1y = c_1,$

where $a_1, b_1, c_1 \in F_n$. In Exercise 4 you will show that $C$ is given by an equation

(10.3)    $x^2 + y^2 + a_2x + b_2y + c_2 = 0,$

where $a_2, b_2, c_2 \in F_n$. Now suppose that $a_1 \neq 0$. Then dividing (10.2) by $a_1$ and
relabeling, we can assume that the line $\ell$ is given by $x + b_1y = c_1$. Substituting
$x = -b_1y + c_1$ into (10.3) gives the quadratic equation

$$(-b_1y + c_1)^2 + y^2 + a_2(-b_1y + c_1) + b_2y + c_2 = 0.$$

By the quadratic formula, the values of $y$ involve the square root of an expression in
$F_n$. If this lies in $F_n$, then so do $y$ and $x = -b_1y + c_1$, and it follows that the real and
imaginary parts of $\alpha$ lie in $F_n$. On the other hand, if this square root does not lie in $F_n$,
then it lies in a quadratic extension $F_n \subset F_{n+1}$. Then $y$ and $x = -b_1y + c_1$ also lie in
$F_{n+1}$, which shows that the real and imaginary parts of $\alpha$ lie in a quadratic extension
of $F_n$. When $a_1 = 0$, the argument is similar and is left as part of Exercise 4.
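
The substitution step can be made concrete by expanding the quadratic in $y$ and reading off its discriminant, which is the "expression in $F_n$" whose square root may force a quadratic extension. The sketch below works over the rationals; the function name and the sample line and circle are assumptions chosen for illustration.

```python
from fractions import Fraction

def line_circle_quadratic(b1, c1, a2, b2, c2):
    """Coefficients (A, B, C) and discriminant of the quadratic A*y^2 + B*y + C = 0
    obtained by substituting x = -b1*y + c1 into x^2 + y^2 + a2*x + b2*y + c2 = 0."""
    A = b1 * b1 + 1
    B = -2 * b1 * c1 - a2 * b1 + b2
    C = c1 * c1 + a2 * c1 + c2
    disc = B * B - 4 * A * C
    return A, B, C, disc

# Line x = 1/2 (so b1 = 0, c1 = 1/2) and the unit circle x^2 + y^2 - 1 = 0.
A, B, C, disc = line_circle_quadratic(Fraction(0), Fraction(1, 2),
                                      Fraction(0), Fraction(0), Fraction(-1))
print(A, B, C, disc)   # 1 0 -3/4 3
```

In this example the discriminant is 3, so $y = \pm\sqrt{3}/2$ and the intersection points have coordinates in the quadratic extension $\mathbb{Q}(\sqrt{3})$.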

Finally, suppose that the last step in the construction of $\alpha$ uses P3, the intersection
of distinct circles $C_1$ and $C_2$. As above, we can find $\mathbb{Q} = F_0 \subset \cdots \subset F_n \subset \mathbb{C}$ where
$[F_i : F_{i-1}] = 2$ such that the circles are given by equations

(10.4)    $x^2 + y^2 + a_1x + b_1y + c_1 = 0,$
          $x^2 + y^2 + a_2x + b_2y + c_2 = 0,$

where all of the coefficients lie in $F_n$. Furthermore, we know that the real and
imaginary parts of $\alpha$ give a solution of (10.4).

If we subtract these equations, we get the equation

(10.5)    $(a_1 - a_2)x + (b_1 - b_2)y + (c_1 - c_2) = 0.$

Since the circles (10.4) are distinct but not disjoint, one easily sees that the coefficients
of $x$ and $y$ in (10.5) don't vanish simultaneously. Thus (10.5) defines a line.
Furthermore, if we combine this equation with the first equation of (10.4), then we
are in the previous case of the intersection of a circle and a line. We conclude that
the real and imaginary parts of $\alpha$ lie in $F_n$ or in a quadratic extension of $F_n$. This
completes the proof. ∎
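
The reduction from two circles to a circle and a line is a single subtraction of coefficients. A short sketch over the rationals follows; the helper name `radical_line` borrows the standard geometric term for the resulting line, which the text itself does not use, and the sample circles are illustrative.

```python
from fractions import Fraction

def radical_line(circle1, circle2):
    """Given circles x^2 + y^2 + a*x + b*y + c = 0 as triples (a, b, c),
    return (A, B, C) with A*x + B*y + C = 0, obtained by subtracting the equations."""
    a1, b1, c1 = circle1
    a2, b2, c2 = circle2
    A, B, C = a1 - a2, b1 - b2, c1 - c2
    if A == 0 and B == 0:
        raise ValueError("concentric circles: subtraction does not give a line")
    return A, B, C

# Unit circles centered at 0 and at 1: x^2 + y^2 - 1 = 0 and x^2 + y^2 - 2x = 0.
unit_at_0 = (Fraction(0), Fraction(0), Fraction(-1))
unit_at_1 = (Fraction(-2), Fraction(0), Fraction(0))
print(radical_line(unit_at_0, unit_at_1))   # the line 2x - 1 = 0
```

The resulting line $x = 1/2$ can then be intersected with either circle as in the P2 case.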


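Corollary 10.1.7 $\mathscr{C}$ is the smallest subfield of $\mathbb{C}$ that is closed under the operation
of taking square roots.


Proof: By Theorem 10.1.4, we know that $\alpha \in \mathscr{C}$ implies that $\sqrt{\alpha} \in \mathscr{C}$. Now let $F$
be any subfield of $\mathbb{C}$ closed under taking square roots, and suppose that $\alpha \in \mathscr{C}$. By
Theorem 10.1.6, we have $\mathbb{Q} = F_0 \subset \cdots \subset F_n \subset \mathbb{C}$ where $[F_i : F_{i-1}] = 2$ and $\alpha \in F_n$. The
argument in the first paragraph of the proof of Theorem 10.1.6, with $F$ in place of $\mathscr{C}$,
shows that $F_n \subset F$. Thus $\alpha \in F_n \subset F$, and $\mathscr{C} \subset F$ follows as desired. ∎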


Theorem 10.1.6 also has the following useful consequence.

Corollary 10.1.8 If $\alpha \in \mathscr{C}$, then $[\mathbb{Q}(\alpha) : \mathbb{Q}] = 2^m$ for some $m \ge 0$. Thus every
constructible number is algebraic over $\mathbb{Q}$, and the degree of its minimal polynomial
over $\mathbb{Q}$ is a power of 2.

Proof: If $\alpha \in \mathscr{C}$, then Theorem 10.1.6 gives extensions $\mathbb{Q} = F_0 \subset \cdots \subset F_n \subset \mathbb{C}$
where $[F_i : F_{i-1}] = 2$ and $\alpha \in F_n$. Hence

$$[F_n : \mathbb{Q}] = [F_n : F_0] = [F_n : F_{n-1}] \cdots [F_1 : F_0] = 2^n$$

by the Tower Theorem. However, we also have $\mathbb{Q} \subset \mathbb{Q}(\alpha) \subset F_n$. Using the Tower
Theorem again, we conclude that $[\mathbb{Q}(\alpha) : \mathbb{Q}]$ divides $[F_n : \mathbb{Q}] = 2^n$. ∎


Some of the most famous problems in Greek geometry are trisection of the angle, 
duplication of the cube, and squaring the circle. Using Corollary 10.1.8, we can solve 
these as follows. 


Example 10.1.9 Trisection of the Angle. We know that every angle can be bisected
using straightedge and compass. We will prove that this is not true for trisections, i.e.,
there exist angles that cannot be trisected by straightedge and compass. Suppose, for
instance, that we could trisect a 120° angle in this way. Since we can construct a 120°
angle from 0 and 1 by straightedge and compass (see Exercise 5), a trisection of this
angle would imply that we could construct a 40° angle from 0 and 1 by straightedge
and compass. Intersecting this with the unit circle centered at the origin, it would
follow that the 9th root of unity $\zeta_9 = e^{2\pi i/9}$ would be a constructible number (since
40° = $2\pi/9$ radians).

However, Theorem 9.1.9 implies that the minimal polynomial of $\zeta_9$ is the cyclotomic
polynomial $\Phi_9(x)$, and the factorization

$$x^9 - 1 = \Phi_1(x)\Phi_3(x)\Phi_9(x) = (x - 1)(x^2 + x + 1)(x^6 + x^3 + 1)$$

from Proposition 9.1.5 shows that $x^6 + x^3 + 1$ is the minimal polynomial of $\zeta_9$. Since 6 is
not a power of 2, Corollary 10.1.8 implies that $\zeta_9$ is not constructible. This contradiction
proves that we cannot trisect 120° using straightedge and compass. In Exercise 6 you will
show that it is also impossible to trisect 60° by straightedge and compass.

In Section 10.2, we will use the results of Section 9.1 to determine all $n$ for which
$\zeta_n$ is constructible.
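
The factorization of $x^9 - 1$ used in Example 10.1.9 is easy to confirm by multiplying coefficient lists. The sketch below is only a check of the arithmetic; the helper `poly_mul` and the list encoding (lowest degree first) are choices made here.

```python
def poly_mul(p, q):
    """Multiply polynomials given as coefficient lists, lowest degree first."""
    result = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            result[i + j] += a * b
    return result

phi1 = [-1, 1]                    # x - 1
phi3 = [1, 1, 1]                  # x^2 + x + 1
phi9 = [1, 0, 0, 1, 0, 0, 1]      # x^6 + x^3 + 1

product = poly_mul(poly_mul(phi1, phi3), phi9)
print(product)                    # coefficients of x^9 - 1

degree = len(phi9) - 1
is_power_of_two = (degree & (degree - 1)) == 0
print(degree, is_power_of_two)    # 6 False
```

The degree 6 is not a power of 2, which is exactly what Corollary 10.1.8 needs.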


Example 10.1.10 Duplication of the Cube. Here, the problem is to take a given
cube and construct one with exactly twice the volume. We can pick our units of
measurement so that the given cube has edges of length 1. In these units, the volume
is also 1, which means that we need to construct a cube of volume 2. Since volume
is edge length cubed, it follows that if we could duplicate the cube, then we could
construct a number $s$ such that $s^3 = 2$, i.e., $s = \sqrt[3]{2}$. Furthermore, since the cube
has edge length 1, we can assume that the construction begins with 0 and 1. It
follows that duplicating the cube by straightedge and compass implies that $s = \sqrt[3]{2}$
is constructible. But $x^3 - 2$ is the minimal polynomial of $\sqrt[3]{2}$ over $\mathbb{Q}$, and its degree 3
is not a power of 2, so that $\sqrt[3]{2}$ is not constructible, by Corollary 10.1.8. This
contradiction proves that we cannot duplicate the cube by straightedge and compass.


Example 10.1.11 Squaring the Circle. This is the problem of constructing a square
whose area is equal to that of a given circle. If we pick our units of measurement so
that the given circle has radius 1, then the circle has area $\pi$. Since a square of area
$\pi$ has side $\sqrt{\pi}$, it follows that if we could square the circle, then we could construct
$\sqrt{\pi}$. Furthermore, since the circle has radius 1, we can assume that the construction
begins with 0 and 1. It follows that squaring the circle by straightedge and compass
would imply that $\sqrt{\pi}$ is constructible.

Since $\mathscr{C}$ is a field, the constructibility of $\sqrt{\pi}$ would imply that $\pi = (\sqrt{\pi})^2$ is
also constructible. Then Corollary 10.1.8 would imply that $\pi$ is algebraic over
$\mathbb{Q}$. However, in 1882 Lindemann proved that $\pi$ is transcendental over $\mathbb{Q}$. A self-contained
proof can be found in [Hadlock, Sec. 1.7]. This contradiction shows that
we cannot square the circle by straightedge and compass.


One could also ask whether the converse of Corollary 10.1.8 is true. In other
words, if $\alpha \in \mathbb{C}$ is algebraic over $\mathbb{Q}$ and the degree of its minimal polynomial is a
power of 2, is $\alpha$ constructible? The following result will answer this question.


Theorem 10.1.12 Let $\alpha \in \mathbb{C}$ be algebraic over $\mathbb{Q}$, and let $\mathbb{Q} \subset L$ be the splitting field
of the minimal polynomial of $\alpha$ over $\mathbb{Q}$. Then $\alpha$ is constructible if and only if $[L : \mathbb{Q}]$
is a power of 2.


Proof: First suppose that $[L : \mathbb{Q}]$ is a power of 2. We will follow the proof of
the Fundamental Theorem of Algebra given in Section 8.5. Since $\mathbb{Q} \subset L$ is Galois,
it follows that $|\mathrm{Gal}(L/\mathbb{Q})| = [L : \mathbb{Q}]$ is a power of 2, say $|\mathrm{Gal}(L/\mathbb{Q})| = 2^m$. By
Theorem 8.1.7, $\mathrm{Gal}(L/\mathbb{Q})$ is solvable, which by Definition 8.1.1 means that we have
subgroups

$$\{e\} = G_m \subset G_{m-1} \subset \cdots \subset G_1 \subset G_0 = \mathrm{Gal}(L/\mathbb{Q})$$

such that $G_i$ is normal in $G_{i-1}$ of index 2 (since $|\mathrm{Gal}(L/\mathbb{Q})| = 2^m$). This gives

$$\mathbb{Q} = L_{G_0} \subset L_{G_1} \subset \cdots \subset L_{G_m} = L,$$

where $[L_{G_i} : L_{G_{i-1}}] = 2$ for all $i$. By Theorem 10.1.6, $\alpha \in L$ is constructible.

Turning to the converse, we first show that $\mathbb{Q} \subset \mathscr{C}$ is a normal extension. For this,
let $\alpha \in \mathscr{C}$, and let $f$ be the minimal polynomial of $\alpha$ over $\mathbb{Q}$. We need to prove that $f$
splits completely over $\mathscr{C}$. Since $\alpha$ is constructible, Theorem 10.1.6 gives extensions
$\mathbb{Q} = F_0 \subset \cdots \subset F_n \subset \mathbb{C}$ where $[F_i : F_{i-1}] = 2$ and $\alpha \in F_n$. Then let $\mathbb{Q} \subset M$ be the
Galois closure of $\mathbb{Q} \subset F_n$, as constructed in Proposition 7.1.7. In Exercise 7 you will
show that we may assume that $M \subset \mathbb{C}$.

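Note that $f$ splits completely in $M$, since $M$ is normal over $\mathbb{Q}$, $f$ is irreducible
over $\mathbb{Q}$, and $\alpha \in F_n \subset M$ is a root of $f$. Now let $\beta \in M$ be any root of $f$. By
Proposition 5.1.8, there is $\sigma \in \mathrm{Gal}(M/\mathbb{Q})$ such that $\sigma(\alpha) = \beta$. Applying $\sigma$ to the
fields $\mathbb{Q} = F_0 \subset \cdots \subset F_n \subset M$ gives

$$\mathbb{Q} = \sigma(\mathbb{Q}) = \sigma(F_0) \subset \cdots \subset \sigma(F_n)$$

such that $[\sigma(F_i) : \sigma(F_{i-1})] = [F_i : F_{i-1}] = 2$ for all $i$. By Theorem 10.1.6, $\beta = \sigma(\alpha) \in
\sigma(F_n)$ is constructible. This shows that $f$ splits completely over $\mathscr{C}$.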

It follows that $\mathscr{C}$ contains a splitting field $L$ of $f$ over $\mathbb{Q}$. By the Theorem of the
Primitive Element, we have $L = \mathbb{Q}(\gamma)$ for some $\gamma \in L$. Since $\gamma \in \mathscr{C}$, Corollary 10.1.8
implies that $[\mathbb{Q}(\gamma) : \mathbb{Q}] = [L : \mathbb{Q}]$ is a power of 2, as claimed. This completes the proof
of the theorem. ∎


We can use Theorem 10.1.12 to show that the converse of Corollary 10.1.8 is false. 
Here is an example. 


Example 10.1.13 Let $\alpha$ be a root of the polynomial

$$f = x^4 - 4x^2 + x + 1.$$

One easily checks that $f$ is irreducible over $\mathbb{Q}$, so that $[\mathbb{Q}(\alpha) : \mathbb{Q}] = 4$. However, in
Chapter 13 we will show that the splitting field $L$ of $f$ over $\mathbb{Q}$ satisfies $[L : \mathbb{Q}] = 24$.
By Theorem 10.1.12 we conclude that $\alpha$ is not constructible.


Mathematical Notes 
There are several issues that are worthy of further comment. 


• Starting Configurations. According to Definition 10.1.1, a constructible number
is constructed by a sequence of constructions that always begins with 0 and 1. It is
possible to begin constructions with different starting configurations. For example,
three noncollinear points $\alpha, \beta, \gamma$ determine an angle with vertex $\alpha$ and rays $\overrightarrow{\alpha\beta}$ and
$\overrightarrow{\alpha\gamma}$. This angle can be bisected by straightedge and compass, even though $\alpha, \beta, \gamma$
need not be constructible.

We will not develop the theory of such constructions beyond the comment that
the trisection of the angle is most naturally stated in this context: given an angle
determined by $\alpha, \beta, \gamma$ as above, one seeks a construction that trisects this angle by
straightedge and compass. In Example 10.1.9 we showed that this is impossible by
finding a particular case of $\alpha, \beta, \gamma$ to which we could apply Corollary 10.1.8.


• Compasses. In C2, the compass uses points $\alpha \neq \beta$ to give the radius, with the
center given by a third point $\gamma$. This is slightly different from what Euclid does, for
he uses the compass with points $\alpha \neq \beta$ where the center is either $\alpha$ or $\beta$. One can
prove that this more restrictive notion of compass (called the Euclidean compass in
Martin's book [15]) gives the same set of constructible points.

More surprising is the fact that we can dispense with the straightedge entirely.
The Mohr-Mascheroni Theorem states that $\alpha \in \mathbb{C}$ is constructible if and only if there
is a sequence of Euclidean compass constructions that starts with 0 and 1 and ends
with $\alpha$. A proof can be found in [15, Ch. 3].


• Straightedge and Dividers. A set of dividers is a tool that can copy line segments.
In other words, given points $\alpha \neq \beta$ and a point $\gamma$ on a line $\ell$, dividers allow us to
construct points $\delta_1, \delta_2 \in \ell$ such that the distance from $\delta_1$ or $\delta_2$ to $\gamma$ equals the distance
from $\alpha$ to $\beta$, as in the following picture:


[Figure: the points $\alpha$, $\beta$ and the points $\delta_1$, $\delta_2$ on the line $\ell$]


Let $\mathscr{P}$ denote the set of real numbers that can be obtained from 0, 1, and $i$
by a sequence of straightedge-and-dividers constructions. By [15, Thm. 5.6], $\mathscr{P}$
is a subfield of $\mathbb{R}$. A more interesting property of $\mathscr{P}$ is that if $a, b \in \mathscr{P}$, then
$\sqrt{a^2 + b^2} \in \mathscr{P}$. To prove this, note that the y-axis is constructible using 0, $i$, and the
straightedge, so that given $b \in \mathscr{P}$, we can construct $ib$ using our dividers. Combining
this with $a \in \mathscr{P}$, we get the diagram


[Figure: the right triangle with vertices 0, $a$, and $ib$]


Now use the dividers to transfer the line segment joining $a$ and $ib$ to the positive x-axis,
starting from 0. The Pythagorean Theorem implies that $\sqrt{a^2 + b^2} \in \mathscr{P}$, as claimed.
In general, a subfield of $\mathbb{R}$ that contains $\sqrt{a^2 + b^2}$ whenever it contains $a$ and $b$ is
called Pythagorean. Thus $\mathscr{P}$ is Pythagorean, and in Exercise 8 you will show that $\mathscr{P}$
is the smallest Pythagorean subfield of $\mathbb{R}$. This is an analog of Corollary 10.1.7.
We call $\mathscr{P}$ the field of Pythagorean numbers.
The most interesting result about $\mathscr{P}$ is the following analog of Theorem 10.1.12.


Theorem 10.1.14 Let $\alpha \in \mathbb{R}$ be algebraic over $\mathbb{Q}$, and let $f$ be the minimal polynomial
of $\alpha$ over $\mathbb{Q}$ with splitting field $L$. Then the following are equivalent:

(a) $\alpha \in \mathscr{P}$.

(b) All roots of $f$ are real, and $\alpha$ is constructible.

(c) All roots of $f$ are real, and $[L : \mathbb{Q}]$ is a power of 2.


Proof: The equivalence (a) $\Leftrightarrow$ (b) is proved by Auckly and Cleveland in [3, p. 225].
Then (b) $\Leftrightarrow$ (c) follows from Theorem 10.1.12. ∎


For those who read the discussion of the casus irreducibilis in Section 8.6, we
note the following corollary of Theorems 10.1.14 and 8.6.5.


Theorem 10.1.15 Let $\alpha \in \mathbb{R}$ be algebraic over $\mathbb{Q}$. Then $\alpha \in \mathscr{P}$ if and only if $\alpha$ is
expressible by real radicals and all conjugates of $\alpha$ over $\mathbb{Q}$ are real. ∎


This shows an unexpected relation between geometric constructions and solvabil- 
ity by radicals. 

Numbers in $\mathscr{P}$ are constructed using straightedge and dividers. Since a compass
can be used as a pair of dividers, one sees easily that

$$\mathscr{P} \subset \mathscr{C} \cap \mathbb{R}.$$

Using Theorem 10.1.14, we prove that these fields are not equal as follows.

Example 10.1.16 Consider

$$\alpha = \sqrt{2 + 2\sqrt{2}}.$$

Then $\alpha \in \mathscr{C} \cap \mathbb{R}$, since $\mathscr{C}$ is closed under taking square roots. However, the minimal
polynomial of $\alpha$ is

$$f = x^4 - 4x^2 - 4,$$

which has roots $\pm\sqrt{2 \pm 2\sqrt{2}}$. Two of these roots are not real, so that $\alpha \notin \mathscr{P}$ by
Theorem 10.1.14.
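
The claim about the roots of $f = x^4 - 4x^2 - 4$ can be checked in floating point. The sketch below (variable names are illustrative) evaluates $f$ at the four numbers $\pm\sqrt{2 \pm 2\sqrt{2}}$ and records which of them are real.

```python
import cmath
import math

def f(x):
    return x**4 - 4 * x**2 - 4

s = math.sqrt(2)
# The four candidates +-sqrt(2 +- 2*sqrt(2)); cmath.sqrt handles the negative radicand.
roots = [sign * cmath.sqrt(2 + inner * 2 * s)
         for inner in (1, -1) for sign in (1, -1)]

for root in roots:
    is_root = abs(f(root)) < 1e-9
    is_real = abs(root.imag) < 1e-12
    print(root, is_root, is_real)
```

Since $2 - 2\sqrt{2} < 0$, the two roots $\pm\sqrt{2 - 2\sqrt{2}}$ are purely imaginary, which is what rules out $\alpha \in \mathscr{P}$.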


• Marked Rulers. The straightedge we've been using is an unmarked ruler. But
suppose instead that we have a marked ruler, which is a straightedge with two marks
on it one unit apart. Such a ruler allows the construction of some interesting lines
and points. Provided one starts from the points $0, 1, i$ in $\mathbb{C}$, one can prove that all
straightedge-and-compass constructions can be done using only a marked ruler. We
will see in Section 10.3 that there are also marked-ruler constructions for trisecting 
angles and taking cube roots. 


Historical Notes 


There are two versions of where the problem of duplicating the cube first arose. 
In one version, King Minos was unhappy with the cubical tomb of his son Glaucus 
and ordered its size doubled. In the other, a delegation from Athens asked the oracle 
at Delos for advice about a plague in Athens. The Athenians were told to double the 
size of the cubical altar of Apollo. In both versions, the first attempt was to double
the sides, which multiplies the volume by a factor of eight. The point, of course, is
that what was required was to double the volume, which means that the sides must
be multiplied by $\sqrt[3]{2}$. Because of the popularity of the second version of the story,
the duplication of the cube is sometimes called the Delian problem. 

Less is known about the origin of the other two Greek problems. It may have 
been something like the following. A line segment is easily bisected or trisected by 
straightedge and compass, and bisecting angles is equally easy. Hence it is natural 
to ask about trisecting angles. Similarly, a rectangle or triangle is easily squared by 
straightedge and compass. Since the next most basic geometric figure is a circle, it is 
natural to ask whether it is also squarable. 

Greek geometers never solved these problems, though the search for solutions led
to some wonderful mathematics. For example, consider the lune indicated by the
shaded region:

Around 440 B.C., Hippocrates of Chios discovered this lune in the course of his work
on squaring the circle. The above figure has a quarter circle with radius OA = OB 
and a semicircle with diameter AB. The lune is the region outside the quarter circle 
but inside the semicircle. Hippocrates showed that the shaded region has the same 
area as the triangle AOB (you will verify this in Exercise 9). It follows that this lune 
is squarable, so that at least part of a circle is squarable. Hippocrates found other 
squarable lunes, and in the twentieth century, Chebotarev and Dorodnov showed that 
there are only five squarable lunes. A proof can be found in [18]. See also [20]. 

Somewhat later, Hippias of Elis (ca. 425 B.C.) discovered a curve called the
quadratrix, which he used to trisect angles and square the circle. In modern terms,
this curve is given by

(10.6)    $y = x\cot\left(\dfrac{\pi x}{2}\right).$


In Exercise 10 you will use this curve to trisect angles and square the circle. As 
for duplication of the cube, Menaechmus introduced parabolas around 350 B.C. as a 
by-product of his work on duplication. In Exercise 11 you will show that duplication 
of the cube can be solved by intersecting two parabolas. We will see in Section 10.3 
that many other constructions can be done by intersecting conic sections. 
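
To get a feel for the quadratrix $y = x\cot(\pi x/2)$, one can tabulate it near $x = 0$, where Exercise 10 asserts that it approaches $2/\pi$. The following Python sketch is a numerical illustration only, not a solution to the exercise; the function name is a choice made here.

```python
import math

def quadratrix(x):
    """y = x * cot(pi * x / 2), evaluated for 0 < x < 1."""
    return x / math.tan(math.pi * x / 2)

for x in (0.1, 0.01, 0.001):
    print(x, quadratrix(x))
print("2/pi =", 2 / math.pi)
```

The printed values approach $2/\pi \approx 0.6366$, which is why a length involving $\pi$ becomes available once intersections with this curve are allowed.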

There is a lot more to say about Greek work on these problems. For example, 
the spiral of Archimedes $r = \theta$ has some nice applications that you will study in
Exercise 12. Numerous other examples can be found in Chapter 4 of [9]. Greek 
geometry is more interesting than what you learned in high school, in part because the 
Greeks had these great problems to inspire them. In modern mathematics, unsolved 
problems play a similar role of inspiring research. For example, the inverse Galois 
problem mentioned in Section 7.4 is still unsolved and is being actively studied by 
many researchers. 

The three Greek problems were solved in the nineteenth century. In 1837 Wantzel 
showed that duplication of the cube and trisection of the angle cannot be done by 
straightedge and compass. His argument used the irreducibility of certain cubic 
polynomials. This is similar to what we did in Exercise 6 (for trisection of the angle) 
and Example 10.1.10 (for duplication of the cube). The first page of Wantzel’s paper 
is reproduced on page 84 of [Escofier]. Finally, as noted in Example 10.1.11, the 
problem of squaring the circle was solved in 1882 when Lindemann showed that $\pi$
is transcendental over $\mathbb{Q}$.

We should also note that the variants on straightedge-and-compass constructions 
mentioned in the Mathematical Notes have a long and interesting history. This is 
discussed in [15]. See also the Historical Notes to Section 10.3. 


Exercises for Section 10.1 


Exercise 1. In part (a) of Example 10.1.2 we constructed the x-axis. In a similar way show
that the y-axis is constructible. For each step in your construction be sure to say which of C1,
C2, P1, P2, and P3 you are using.

Exercise 2. Suppose that $\alpha, \beta, \gamma$ are noncollinear and consider the rays $\overrightarrow{\alpha\beta}$ and $\overrightarrow{\alpha\gamma}$ emanating
from $\alpha$ that go through $\beta$ and $\gamma$ respectively. We call this the angle formed by $\alpha, \beta, \gamma$. Also
assume that $\alpha, \beta, \gamma$ are constructible.

(a) Prove that there is a constructible number $\delta$ with positive y-coordinate such that the angle
formed by $\alpha, \beta, \gamma$ is congruent to the angle formed by $0, 1, \delta$. As in Exercise 1, each step
in the construction should be justified by C1, C2, P1, P2, or P3.

(b) Prove the claim made in Example 10.1.3 that $\zeta_n = e^{2\pi i/n}$ is constructible if and only if a
regular n-gon can be constructed by straightedge and compass.


Exercise 3. This exercise covers the details omitted in the proof of Theorem 10.1.4. 

(a) Let $\alpha, \beta$ be constructible numbers such that $0, \alpha, \beta$ are collinear. Prove that $\alpha + \beta$ is
constructible.

(b) Let $a \in \mathscr{C} \cap \{x \in \mathbb{R} \mid x > 0\}$. Use Figure 2 in the proof of Theorem 10.1.4 to show that
$1/a$ is constructible.

(c) In the proof of Theorem 10.1.4, we showed that $\mathscr{C} \cap \{x \in \mathbb{R} \mid x > 0\}$ is closed under
addition, multiplication, and multiplicative inverses. Use this to prove that $\mathscr{C} \cap \mathbb{R}$ is a
subfield of $\mathbb{R}$.

(d) Prove that the number $\beta$ pictured in (10.1) is constructible (assuming that $r$ is
constructible).


Exercise 4. This exercise covers the details omitted in the proof of Theorem 10.1.6. 

(a) Suppose that a line $\ell_1$ goes through distinct points $\alpha_1 = u_1 + iv_1$ and $\beta_1 = u_2 + iv_2$, where
$u_1, v_1, u_2, v_2$ lie in a subfield $F \subset \mathbb{R}$. Prove that $\ell_1$ is defined by an equation of the form
$a_1x + b_1y = c_1$ where $a_1, b_1, c_1 \in F$.

(b) Suppose that $\alpha_2 \neq \beta_2$ and $\gamma_2$ are complex numbers whose real and imaginary parts lie
in a subfield $F \subset \mathbb{R}$. Prove that the circle $C$ with center $\gamma_2$ and radius $|\alpha_2 - \beta_2|$ has an
equation of the form (10.3) with $a_2, b_2, c_2 \in F$.

(c) In the proof of Theorem 10.1.6, we considered the equations (10.2) and (10.3) when
$a_1 \neq 0$. Explain what happens when $a_1 = 0$ in (10.2).


Exercise 5. In this exercise you will give two proofs that $\zeta_3 = e^{2\pi i/3}$ is constructible.

(a) Give a direct geometric construction of $\zeta_3$ with each step justified by citing C1, C2, P1,
P2, or P3.

(b) Use Theorem 10.1.6 to show that $\zeta_3$ is constructible.


Exercise 6. Show that it is impossible to trisect a 60° angle by straightedge and compass. 


Exercise 7. Suppose we have extensions $\mathbb{Q} \subset F \subset \mathbb{C}$ where $[F : \mathbb{Q}]$ is finite. Prove that there
is a field $M$ such that $F \subset M \subset \mathbb{C}$ and $M$ is a Galois closure of $F$ over $\mathbb{Q}$.


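Exercise 8. In the Mathematical Notes we defined the field $\mathscr{P} \subset \mathbb{R}$ and what it means for a
subfield $F \subset \mathbb{R}$ to be Pythagorean.

(a) Let $\alpha$ be a real number. Prove that $\alpha \in \mathscr{P}$ if and only if there is a sequence of fields
$\mathbb{Q} = F_0 \subset \cdots \subset F_n \subset \mathbb{R}$ such that $\alpha \in F_n$ and for $i = 1, \dots, n$ there are $a_i, b_i \in F_{i-1}$ such
that $F_i = F_{i-1}\bigl(\sqrt{a_i^2 + b_i^2}\bigr)$.

(b) Prove that $\mathscr{P}$ is the smallest Pythagorean subfield of $\mathbb{R}$.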


Exercise 9. Show that the lune illustrated in the Historical Notes has the same area as the 
triangle AOB in the illustration. 


Exercise 10. The quadratrix is the curve $y = x\cot(\pi x/2)$ for $0 < x < 1$. In this problem, you
will use this curve to square the circle and trisect the angle.

(a) Show that $2/\pi = \lim_{x \to 0^+} x\cot(\pi x/2)$, i.e., the quadratrix meets the y-axis at $y = 2/\pi$.
We will follow Hippias and include this point in the curve.

(b) Show that we can square the circle starting from 0 and 1 and constructing new points
using C1, C2, P1, P2, or P3, together with the intersections of constructible lines with the
quadratrix.

(c) A point $(a, b)$ on the quadratrix determines an angle $\theta$ as pictured below. Prove that
$\theta = \pi a/2$.


[Figure: the quadratrix with the points $2/\pi$ and $(a, b)$ and the angle $\theta$ marked]


(d) Suppose that we are given an angle $0 < \theta < \pi/2$. Prove that we can trisect $\theta$ starting
from 0, 1, and $\theta$ and constructing new points using C1, C2, P1, P2, or P3, together with
the intersections of constructible lines with the quadratrix.

(e) Explain how the method of part (d) can be adapted to trisect arbitrary angles.

(f) Using the quadratrix, what else can you do to angles besides trisecting them?


Exercise 11. Explain how the points of intersection of the parabolas $y = x^2$ and $x = 2y^2$ enable
one to duplicate the cube. Your explanation should include a picture.


Exercise 12. The spiral of Archimedes is the curve whose polar equation is $r = \theta$.

(a) Explain how the spiral and $\theta = \pi/2$ enable one to square the circle.

(b) Given an angle $\theta$, explain how the spiral enables one to trisect $\theta$.


10.2 REGULAR POLYGONS AND ROOTS OF UNITY


Our next task is to apply the theory developed in Section 10.1 to the question of 
which regular polygons can be constructed by straightedge and compass. Our main 
tool will be the cyclotomic extension $\mathbb{Q} \subset \mathbb{Q}(\zeta_n)$ studied in Chapter 9.

Before stating our main result, we need some terminology: An odd prime $p$ is a
Fermat prime if it can be written in the form

$$p = 2^{2^m} + 1$$

for some integer $m \ge 0$. Following Gauss, we can now characterize constructible
regular polygons as follows. 


Theorem 10.2.1 Let $n > 2$ be an integer. Then a regular n-gon can be constructed
by straightedge and compass if and only if

$$n = 2^s p_1 \cdots p_r,$$

where $s \ge 0$ is an integer and $p_1, \dots, p_r$ are $r \ge 0$ distinct Fermat primes.


Proof: In Example 10.1.3 we saw that a regular n-gon is constructible by straightedge
and compass if and only if $\zeta_n$ is constructible. Using results proved earlier, we
can determine when $\zeta_n$ is constructible as follows:
• By (8.6), $\mathbb{Q} \subset \mathbb{Q}(\zeta_n)$ is a Galois extension.
• By Theorem 10.1.12, it follows that $\zeta_n$ is constructible if and only if $[\mathbb{Q}(\zeta_n) : \mathbb{Q}]$ is
a power of 2.
• By Corollary 9.1.10, $[\mathbb{Q}(\zeta_n) : \mathbb{Q}] = \phi(n)$, where $\phi(n)$ is the Euler $\phi$-function
defined in Section 9.1.
We conclude that $\zeta_n$ is constructible if and only if $\phi(n)$ is a power of 2.
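
The criterion "$\phi(n)$ is a power of 2" is easy to test by machine. The Python sketch below computes Euler's function by trial division and lists the $n$ from 3 to 20 that pass the test; the function names are choices made here, and the code illustrates the criterion rather than the proof.

```python
def euler_phi(n):
    """Euler's phi-function via the product formula over the prime divisors of n."""
    result, m, p = n, n, 2
    while p * p <= m:
        if m % p == 0:
            while m % p == 0:
                m //= p
            result -= result // p   # multiply result by (1 - 1/p)
        p += 1
    if m > 1:                        # one prime factor larger than sqrt(n) may remain
        result -= result // m
    return result

def is_power_of_two(k):
    return k >= 1 and (k & (k - 1)) == 0

constructible = [n for n in range(3, 21) if is_power_of_two(euler_phi(n))]
print(constructible)   # [3, 4, 5, 6, 8, 10, 12, 15, 16, 17, 20]
```

The missing values 7, 9, 11, 13, 14, 18, 19 are exactly those whose factorization involves a non-Fermat odd prime or a repeated odd prime.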
First suppose that n = 2°p,---p,, where p),...,p, are distinct Fermat primes. 
Then part (b) of Lemma 9.1.1 gives the formula 


= —1y)_ J2°'(pi-1)--(pr-1), 5 >, 
a) —nTT( a ane ,—1), s=0. 


It follows that @(n) is a power of 2, since each p; is a Fermat prime. 

Conversely, suppose that $\phi(n)$ is a power of 2, and let the factorization of $n$ be
$n = q_1^{a_1} \cdots q_s^{a_s}$, where $q_1, \ldots, q_s$ are distinct primes and the exponents $a_1, \ldots, a_s$ are all
$\ge 1$. Then part (b) of Lemma 9.1.1 gives the formula

$$\phi(n) = q_1^{a_1 - 1}(q_1 - 1) \cdots q_s^{a_s - 1}(q_s - 1).$$

If $q_i$ is odd, then we must have $a_i = 1$, since $\phi(n)$ is a power of 2, and we also
conclude that $q_i - 1$ must be a power of 2. However, in Exercise 1 you will show
Because of Euler’s negative result, there was little interest in Fermat primes until
Gauss discovered their relation to the constructibility of regular polygons. The first 
entry in his famous mathematical diary, dated March 30, 1796, reads as follows: 

The principles upon which the division of the circle depend, and geometrical 
divisibility of the same into seventeen parts, etc. 
(See [11].) Gauss wrote this one month before his 19th birthday. 

The details of what Gauss proved about regular polygons appear in Section VII 
of Disquisitiones Arithmeticae [10]. As explained in Section 9.2, Gauss studied the 
equations satisfied by periods (special primitive elements of intermediate fields) of 
the extension $\mathbb{Q} \subset \mathbb{Q}(\zeta_p)$, where $p$ is prime. Then, in Article 365 of [10], he applies
his results to show that $\zeta_p$ is constructible when $p$ is a Fermat prime. Though he
asserts that the converse is true, the first published proof of this is due to Wantzel in
1837. In Article 366 Gauss describes which $\zeta_n$ are constructible when $n$ is arbitrary
(Theorem 10.2.1), though his proof is again incomplete.

Gauss knew that a straightedge-and-compass construction of a regular 17-gon was 
a big deal. As he says in Article 365 of [10]: 

It is certainly astonishing that although the geometric divisibility of the circle 
into three and five parts was already known in Euclid’s time, nothing was added 
to this discovery for 2000 years. 


But rather than give an explicit construction, Gauss shows that 


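$$\cos(2\pi/17) = -\frac{1}{16} + \frac{1}{16}\sqrt{17} + \frac{1}{16}\sqrt{34 - 2\sqrt{17}}$$


$$\qquad + \frac{1}{8}\sqrt{17 + 3\sqrt{17} - \sqrt{34 - 2\sqrt{17}} - 2\sqrt{34 + 2\sqrt{17}}}.$$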


Exercises 8 and 9 of Section 9.2 explain how this formula follows from Gauss’s 
theory of periods. From here one can design a construction for the regular 17-gon, 
though it is not very efficient. A more elegant construction can be found in [Stewart, 
Ch. 17], along with a reference for Richelot’s construction of the regular 257-gon 
in 1832. There is also the story of Professor Hermes of Lingen, who late in the 
nineteenth century worked 10 years on the construction of the regular 65537-gon. 
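
Gauss's formula for $\cos(2\pi/17)$ involves only rational numbers and nested square roots, which is exactly what constructibility requires. As a sanity check (not a proof), the following Python sketch evaluates the right-hand side in floating point and compares it with the cosine; the function name `gauss_cos_17` is ours.

```python
from math import cos, pi, sqrt

def gauss_cos_17():
    """Right-hand side of Gauss's nested-radical formula for cos(2*pi/17)."""
    s17 = sqrt(17)
    inner = sqrt(34 - 2 * s17)
    outer = sqrt(17 + 3 * s17 - inner - 2 * sqrt(34 + 2 * s17))
    return -1 / 16 + s17 / 16 + inner / 16 + outer / 8

# The two printed values should agree up to floating-point rounding.
print(cos(2 * pi / 17), gauss_cos_17())
```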

We conclude with some remarks about arc length. This was an important topic 
in the seventeenth and eighteenth centuries. For example, by inscribing a regular 
n-gon in the unit circle, one easily sees that constructing the n-gon by straightedge 
and compass is equivalent to dividing a circle into n equal arcs by straightedge and 
compass. Another example involves the lemniscate, which is the curve in the plane 
defined by the polar equation $r^2 = \cos 2\theta$:

In 1716 Fagnano discovered a method for doubling and halving an arc of the
lemniscate. In particular, he showed that the circle of radius $\sqrt{\sqrt{2} - 1}$ (drawn in dashes
in the illustration at the bottom of the previous page) divides each quadrant of the 
lemniscate into arcs of equal length. Hence the lemniscate can be divided into eight 
equal arcs by straightedge and compass. In Chapter 15 we will explore a remarkable 
generalization of this discovered by Abel. 


Exercises for Section 10.2 


Exercise 1. Suppose that $2^k + 1$ is an odd prime. Prove that $k$ is a power of 2.


Exercise 2. Let $p$ be prime. In Example 9.1.6, we showed that
$$\Phi_{p^2}(x) = x^{p(p-1)} + x^{p(p-2)} + \cdots + x^p + 1.$$


The goal of this exercise is to prove that $\Phi_{p^2}(x)$ is irreducible over $\mathbb{Q}$ using only the
Schönemann-Eisenstein criterion.
(a) Explain how the formulas of Example 9.1.6 imply that


$$(x+1)^{p^2} - 1 = \bigl((x+1)^p - 1\bigr)\,\Phi_{p^2}(x+1).$$
(b) Let $\overline{\Phi}_{p^2}(x+1)$ be the reduction of $\Phi_{p^2}(x+1)$ modulo $p$. Show that
$$x^{p^2} = x^p\,\overline{\Phi}_{p^2}(x+1).$$


(c) Show that $\Phi_{p^2}(x+1)$ is irreducible over $\mathbb{Q}$ by the Schönemann-Eisenstein criterion. As
in the proof of Proposition 4.2.5, this will imply that the same is true for $\Phi_{p^2}(x)$.
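
Before attempting Exercise 2, it may help to see the Schönemann-Eisenstein hypotheses hold for a few small primes. The Python sketch below is an experiment, not a proof: it expands $\Phi_{p^2}(x+1) = \sum_{j=0}^{p-1} (x+1)^{pj}$ with binomial coefficients and tests the three divisibility conditions. The function names are our own.

```python
from math import comb

def shifted_phi_p2(p):
    """Coefficients (constant term first) of Phi_{p^2}(x+1) = sum_{j<p} (x+1)^(p*j)."""
    coeffs = [0] * (p * (p - 1) + 1)
    for j in range(p):
        for k in range(p * j + 1):
            coeffs[k] += comb(p * j, k)   # coefficient of x^k in (x+1)^(p*j)
    return coeffs

def eisenstein_applies(coeffs, p):
    """p does not divide the leading coefficient, p divides all others, p^2 does not divide the constant term."""
    lead, lower = coeffs[-1], coeffs[:-1]
    return lead % p != 0 and all(c % p == 0 for c in lower) and coeffs[0] % (p * p) != 0

for p in (2, 3, 5, 7):
    print(p, eisenstein_applies(shifted_phi_p2(p), p))
```

For $p = 3$ the expansion is $x^6 + 6x^5 + 15x^4 + 21x^3 + 18x^2 + 9x + 3$, where every coefficient below the leading one is divisible by 3 and the constant term is not divisible by 9.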


Exercise 3. Using only Proposition 4.2.5, Theorem 10.1.12, and Exercise 1, show that $\zeta_p$ is
constructible if and only if $p$ is a Fermat prime.


Exercise 4. Prove that

$$(\zeta_n)^{n/m} = \zeta_m$$
when $m \mid n$, $m > 0$, and use this to conclude that if $\zeta_n$ is constructible and $m \mid n$, $m > 0$, then $\zeta_m$
is constructible.


Exercise 5. Suppose that $n = 2^s p_1 \cdots p_r$, where $p_1, \ldots, p_r$ are distinct Fermat primes. Then
$\zeta_{p_i}$ is constructible by Exercise 3.

(a) Show that $\zeta_{2^s}$ is constructible.

(b) Assume that $\zeta_a, \zeta_b$ are constructible and $\gcd(a, b) = 1$. Prove that $\zeta_{ab}$ is constructible.

(c) Conclude that $\zeta_n$ is constructible, since $\zeta_{2^s}, \zeta_{p_1}, \ldots, \zeta_{p_r}$ are.


Exercise 6. Now suppose that $\zeta_n$ is constructible for some $n > 2$. The goal of this exercise
is to prove that if $p$ is an odd prime dividing $n$, then $p$ is a Fermat prime and $p^2 \nmid n$. This
and Exercise 5 will give a proof of Theorem 10.2.1 that doesn’t require knowing that $\Phi_n(x)$ is
irreducible for arbitrary $n$.
(a) Let $p$ be an odd prime dividing $n$. Use Exercises 3 and 4 to show that $p$ is a Fermat prime.
(b) Now assume that $p$ is an odd prime and $p^2 \mid n$. Use Exercise 4 to show that $\zeta_{p^2}$ is
constructible. Then use Theorem 10.1.12 and Exercise 2 to obtain a contradiction.
In Chapter 15 we will use a similar strategy to prove Abel’s theorem about straightedge-and- 
compass constructions on the lemniscate. 
Exercise 7. Prove that 3, 5, 17, 257, and 65537 are Fermat primes.


Exercise 8. Use $\log_{10}(F_{33}) \approx 2^{33} \log_{10}(2)$ to estimate the number of digits in the decimal
expansion of $F_{33}$. Then do the same for $F_{2478782}$.
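
Exercises 7 and 8 lend themselves to a quick computer experiment. The Python sketch below assumes the notation $F_m = 2^{2^m} + 1$ for the Fermat numbers, uses naive trial division (adequate only because $F_4 = 65537$ is small), and checks the well-known factor 641 of $F_5$. All function names are ours. Since the digit count of $F_{2478782}$ is itself far too large to store as a float, the last helper returns its base-10 logarithm instead.

```python
from math import log10

def fermat_number(m):
    """F_m = 2^(2^m) + 1 (notation assumed here)."""
    return 2 ** (2 ** m) + 1

def is_prime(n):
    """Naive trial division; only suitable for small n."""
    if n < 2:
        return False
    if n % 2 == 0:
        return n == 2
    d = 3
    while d * d <= n:
        if n % d == 0:
            return False
        d += 2
    return True

def log10_digit_count(m):
    """log10 of the estimate 2^m * log10(2) for the number of digits of F_m."""
    return m * log10(2) + log10(log10(2))

print([fermat_number(m) for m in range(5)])                 # 3, 5, 17, 257, 65537
print(all(is_prime(fermat_number(m)) for m in range(5)))    # these five are prime
print(fermat_number(5) % 641)                               # F_5 is divisible by 641
print(round(2 ** 33 * log10(2)))                            # approximate digit count of F_33
print(log10_digit_count(2478782))                           # log10 of the digit count of F_2478782
```

The estimate shows that $F_{33}$ has roughly 2.6 billion decimal digits, while the number of digits of $F_{2478782}$ is itself a number with about 746,000 digits.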


10.3 ORIGAMI (OPTIONAL)


In this optional section, we will use origami—the art of Japanese paper folding—to 
do some constructions not possible by straightedge and compass. We will also give 
a careful description of origami numbers and explain what they mean from the point 
of view of Galois theory. 


A. Origami Constructions. We begin with a classic origami construction that
shows how to trisect an angle. Take an arbitrary angle $\theta$ between $\pi/4$ and $\pi/2$, and
put it in the bottom left corner of a square sheet of paper. This gives the picture on
the left:


Thus $\theta$ is the angle between the line $l_2$ and the bottom of the sheet. Then, as indicated
on the left, fold the sheet twice to obtain two lines parallel to the bottom such that
the line $l_1$ is equidistant to the parallel lines through the points $P_1$ and $P_2$.

Now, turning to the picture on the right, do a classic origami move that folds the
sheet so that $P_1$ moves to a point $Q_1$ on $l_1$ and $P_2$ moves to a point $Q_2$ on $l_2$. You
should try this on a sheet of paper (a rectangular sheet will work fine). In Exercise 1
you will prove that the angle made by the bottom and the segment $P_1Q_1$ is $\theta/3$. Thus
we have trisected an arbitrary angle $\pi/4 < \theta < \pi/2$ using origami!

Origami also makes it easy to double or halve angles. From this and the above 
construction, it follows that one can trisect any angle using origami (see Exercise 2). 

We can also solve cubic equations using origami. But before explaining this, we 
need to think about the underlying geometry of the trisection given in (10.7). The 
surprise is that we are dealing with simultaneous tangents to parabolas. To see how 
this works, consider the geometric description of a parabola, which is defined as the 
locus of all points $P$ equidistant from a fixed point $P_1$ (the focus) and a fixed line $l_1$
(the directrix). This gives the following picture:


(10.8)


In this picture, the segments $P_1P$ and $PQ_1$ have equal length, and $PQ_1$ is perpendicular
to the directrix $l_1$. The key point, which you will prove in Exercise 3, is that $Q_1$ is the
reflection of $P_1$ about the tangent line at $P$ (the dashed line in the picture). You will
also prove the converse. Thus we have the following result.


Lemma 10.3.1 In the plane, let $P_1$ be a point not on a line $l_1$. Then, given another
line $\ell$, the reflection of $P_1$ about $\ell$ lies on $l_1$ if and only if $\ell$ is tangent to the parabola
with focus $P_1$ and directrix $l_1$.
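
One direction of Lemma 10.3.1 can be tested numerically. The Python sketch below uses the coordinates of Exercise 3 (focus $(0, a)$, directrix $y = -a$, parabola $4ay = x^2$) and reflects the focus about the tangent line at the point with $x$-coordinate $x_0$; the reflected point should be $(x_0, -a)$, which lies on the directrix. The helper names are ours, and the check is an illustration rather than a proof.

```python
def reflect(point, line_point, direction):
    """Reflect `point` about the line through `line_point` with direction vector `direction`."""
    px, py = point
    qx, qy = line_point
    dx, dy = direction
    # Parameter of the foot of the perpendicular from `point` to the line.
    t = ((px - qx) * dx + (py - qy) * dy) / (dx * dx + dy * dy)
    foot_x, foot_y = qx + t * dx, qy + t * dy
    return (2 * foot_x - px, 2 * foot_y - py)

def reflected_focus(a, x0):
    """Reflect the focus (0, a) of 4ay = x^2 about the tangent line at x = x0."""
    y0 = x0 ** 2 / (4 * a)
    slope = x0 / (2 * a)          # derivative of y = x^2 / (4a) at x0
    return reflect((0.0, a), (x0, y0), (1.0, slope))

for x0 in (-3.0, 0.5, 2.0):
    print(x0, reflected_focus(1.5, x0))   # second coordinate should be close to -1.5
```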


To see how this relates to origami, look back at (10.7). The origami move we used
took $P_1$ to $Q_1 \in l_1$ and $P_2$ to $Q_2 \in l_2$ and was done by folding along the dashed line.
This means that the reflection of $P_1$ about the dashed line lies on $l_1$, so that the dashed
line is tangent to the parabola with focus $P_1$ and directrix $l_1$ by Lemma 10.3.1. The
same argument shows that the dashed line is also tangent to the parabola with focus
$P_2$ and directrix $l_2$. We conclude that using origami, one can find the simultaneous
tangents to two given parabolas.

Here is an example of how to use this. 


Example 10.3.2 Let us find the real roots of the cubic equation $x^3 + ax + b = 0$,
where $a, b \in \mathbb{R}$ and $b \ne 0$. Following the paper [1], we consider the parabolas


(10.9) $(y - \frac{1}{2}a)^2 = 2bx$ and $y = \frac{1}{2}x^2$.


Let $\ell$ be a line with slope $m$ that is simultaneously tangent to these parabolas, say
at points $(x_1, y_1)$ on the first and $(x_2, y_2)$ on the second. In Exercise 4 you will use
calculus to show that the slope of the tangent line to the first parabola at $(x_1, y_1)$ is


$$m = \frac{b}{y_1 - \frac{1}{2}a}.$$

This implies that $m \ne 0$ and $y_1 - \frac{1}{2}a = \frac{b}{m}$, from which we easily conclude that


(10.10) $x_1 = \dfrac{(y_1 - \frac{1}{2}a)^2}{2b} = \dfrac{(b/m)^2}{2b} = \dfrac{b}{2m^2}, \qquad y_1 = \dfrac{b}{m} + \dfrac{a}{2}.$


Computing the slope of the tangent line to the second parabola at $(x_2, y_2)$ gives


(10.11) $x_2 = m,$
which easily implies that
$$y_2 = \frac{m^2}{2}.$$


If we substitute these values into $m = (y_2 - y_1)/(x_2 - x_1)$, then we obtain


$$m = \frac{y_2 - y_1}{x_2 - x_1} = \frac{\frac{m^2}{2} - \bigl(\frac{b}{m} + \frac{a}{2}\bigr)}{m - \frac{b}{2m^2}} = \frac{m^4 - 2bm - am^2}{2m^3 - b}.$$


Since $m \ne 0$, it follows without difficulty that $m$ satisfies the equation
(10.12) $m^3 + am + b = 0.$


Hence the slopes of the simultaneous tangents to the parabolas (10.9) are roots of the
cubic $m^3 + am + b$. In Exercise 5 you will do this using origami.
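
The computation in Example 10.3.2 can be checked numerically for the values $a = 2$, $b = 1$ used in Exercise 5. The Python sketch below finds the real root $m$ of $m^3 + am + b$ by bisection (the bracket $[-1, 0]$ is chosen by hand for these values) and verifies that the points of tangency given by (10.10) and (10.11) lie on a common line of slope $m$. The function names are ours, and this is a floating-point illustration only.

```python
def bisect_root(f, lo, hi, steps=200):
    """Root of f in [lo, hi], assuming f(lo) and f(hi) have opposite signs."""
    f_lo = f(lo)
    for _ in range(steps):
        mid = (lo + hi) / 2
        f_mid = f(mid)
        if (f_lo < 0) == (f_mid < 0):
            lo, f_lo = mid, f_mid    # sign unchanged: root lies in the upper half
        else:
            hi = mid                 # sign changed: root lies in the lower half
    return (lo + hi) / 2

def tangency_residual(a, b, m):
    """(y2 - y1) - m*(x2 - x1) for the tangency points of (10.10) and (10.11)."""
    x1, y1 = b / (2 * m ** 2), b / m + a / 2   # point on (y - a/2)^2 = 2bx
    x2, y2 = m, m ** 2 / 2                     # point on y = x^2 / 2
    return (y2 - y1) - m * (x2 - x1)

a, b = 2.0, 1.0
m = bisect_root(lambda t: t ** 3 + a * t + b, -1.0, 0.0)
print(m, tangency_residual(a, b, m))   # the residual should be close to 0
```

Algebraically, the residual equals $-(m^3 + am + b)/(2m)$, so it vanishes exactly when $m$ is a root of the cubic (10.12).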


B. Origami Numbers. Our next task is to give a careful description of the numbers 
we get when we add the origami move used in (10.7) to the constructions C1 and C2 
defined in Section 10.1. More precisely, consider the following origami construction: 


C3. From $\alpha_1 \ne \alpha_2$ not lying on lines $\ell_1 \ne \ell_2$, we can draw a line $\ell$ that reflects $\alpha_1$ to
a point on $\ell_1$ and $\alpha_2$ to a point on $\ell_2$.


The dashed line in (10.7) is an example of C3. There are situations where no line $\ell$
satisfies the conditions of C3 (see Exercise 6). Hence what C3 really says is that we
are allowed to use such a line $\ell$ whenever it exists.

By Lemma 10.3.1, C3 enables us to draw a simultaneous tangent to two given
parabolas (assuming there is such a tangent). Notice that C3 constructs only the line
$\ell$. This is because in origami, a line is a fold and a point is an intersection of folds.
Of course, once we have $\ell$, we can construct the reflections of $\alpha_1$ and $\alpha_2$ about $\ell$ by
further straightedge-and-compass constructions.

The constructions C1, C2, and C3 create circles and lines, and intersecting them 
using P1, P2, and P3 from Section 10.1 gives new points that can be used for further 
constructions. We define origami numbers as follows. 


Definition 10.3.3 A complex number $\alpha$ is an origami number if there is a finite
sequence of constructions using C1, C2, C3, P1, P2, and P3 that begins with 0 and 1
and ends with $\alpha$.
This definition appears to involve compass, straightedge, and origami. However,
in Chapter 10 of Martin’s book [15], it is shown that all straightedge-and-compass 
constructions can be done using origami (called “paperfolding” by Martin). In 
particular, one can replace C1, C2, C3, P1, P2, and P3 with constructions that involve 
only origami and give the same set of origami numbers. 

The set of all origami numbers has the following structure. 


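Theorem 10.3.4 The set $\mathcal{O} = \{\alpha \in \mathbb{C} \mid \alpha \text{ is an origami number}\}$ is a subfield of $\mathbb{C}$.
Furthermore:


(a) Let $\alpha = a + ib$, where $a, b \in \mathbb{R}$. Then $\alpha \in \mathcal{O}$ if and only if $a, b \in \mathcal{O}$.
(b) $\alpha \in \mathcal{O}$ implies that $\sqrt{\alpha}, \sqrt[3]{\alpha} \in \mathcal{O}$.


(c) A complex number $\alpha$ lies in $\mathcal{O}$ if and only if there are subfields
$$\mathbb{Q} = F_0 \subset F_1 \subset \cdots \subset F_{n-1} \subset F_n \subset \mathbb{C}$$


such that $\alpha \in F_n$ and $[F_i : F_{i-1}] = 2$ or $3$ for $1 \le i \le n$.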


Proof: We refer to [15, Ch. 10] for the proof that @ is a subfield of C. The proof 
of part (a) is similar to what we did in Theorem 10.1.4 and is omitted. 

To prove part (b), write $\alpha$ in polar form as $\alpha = re^{i\theta}$. We may assume $r > 0$. Using
the compass, we can transfer $r$ to the $x$-axis, and then the straightedge-and-compass
construction given in (10.1) shows that $\sqrt{r} \in \mathcal{O}$. Since we can also bisect $\theta$ by
straightedge and compass, it follows that


$$\sqrt{\alpha} = \pm\sqrt{r}\,e^{i\theta/2} \in \mathcal{O}.$$


For the cube root, we can trisect $\theta$ using (10.7) and Exercise 2. To construct $\sqrt[3]{r}$,
consider the parabolas (10.9) with $a = 0$ and $b = -r$. By Exercise 7 the foci $\alpha_1, \alpha_2$
and directrices $l_1, l_2$ of these parabolas are defined over any subfield of $\mathbb{R}$ containing
$r$ and hence can be constructed from $r$ by straightedge and compass. Applying C3
to $\alpha_1, \alpha_2$ and $l_1, l_2$, we can construct a simultaneous tangent $\ell$ to these parabolas. By
(10.12), $\ell$ has slope $m = \sqrt[3]{r}$. This easily implies that $\sqrt[3]{r} \in \mathcal{O}$. Since $\omega = e^{2\pi i/3} \in \mathcal{O}$
(do you see why?), it follows that


$$\sqrt[3]{\alpha} = \omega^j\,\sqrt[3]{r}\,e^{i\theta/3} \in \mathcal{O}, \quad j = 0, 1, 2.$$


In the proof of part (c) we will say that fields $\mathbb{Q} = F_0 \subset \cdots \subset F_n \subset \mathbb{C}$ form a 2-3
tower if $[F_i : F_{i-1}] = 2$ or $3$ for $1 \le i \le n$ (this differs slightly from the terminology
used by Videla in [21]). Now suppose that $\mathbb{Q} = F_0 \subset \cdots \subset F_n$ is a 2-3 tower. We
will prove that $F_n \subset \mathcal{O}$ by induction on $n$. Since the case $n = 0$ is obvious, we may
assume that $F_{n-1} \subset \mathcal{O}$. Given $\alpha \in F_n$, we know that $\alpha$ is a root of a polynomial
$f \in \mathcal{O}[x]$ of degree at most 3, since $[F_n : F_{n-1}] = 2$ or $3$. If $f$ has degree 1, then $\alpha \in \mathcal{O}$
is immediate, and if $f$ has degree 2 or 3, then, by the quadratic formula or Cardan’s
formula, $\alpha$ can be expressed in terms of square roots, cube roots, and elements of $\mathcal{O}$.
By part (b), it follows that $\alpha \in \mathcal{O}$.
Going the other way, let $\alpha$ be an origami number. We will show that there is a 2-3
tower $\mathbb{Q} = F_0 \subset \cdots \subset F_n \subset \mathbb{C}$ such that $F_n$ contains the real and imaginary parts of all
numbers constructed in the course of constructing $\alpha$. The theorem will follow, since
$\alpha = a + ib$ will imply that $a, b \in F_n$, so that $\alpha \in F_n(i)$. (We used the same strategy in
the proof of Theorem 10.1.6.)

We will prove this by induction on the number $N$ of times we use P1, P2, or P3 in
the construction of $\alpha$. First suppose that $\alpha$ is constructed in $N \ge 1$ steps and that the
last step uses P1. Thus $\alpha$ is the intersection of distinct lines $\ell_1$ and $\ell_2$ created earlier
in the construction. If both lines come from C1, then we are done, as in the proof of
Theorem 10.1.6. However, if we used C3 to construct either of the lines, then more
work is needed.

If $\ell_1$ was created using C3, then $\ell_1$ is simultaneously tangent to two parabolas
whose foci and directrices were created earlier in the construction. We claim that $\ell_1$
has an equation whose coefficients lie in a 2-3 tower. To prove this, first consider the
special case when the parabolas are of the form (10.9) for some $a, b \in \mathbb{R}$. Here, our
inductive assumption and Exercise 7 imply that $a, b$ lie in a 2-3 tower. Then the slope
$m$ of $\ell_1$ satisfies the cubic equation (10.12), so that we can extend the 2-3 tower to
get one that contains $a, b, m$. By (10.10), the point $(x_1, y_1) \in \ell_1$ has coordinates in the
2-3 tower. It follows that $\ell_1$ has an equation $Ax + By = C$ whose coefficients lie in the
same 2-3 tower. In the general case when $\ell_1$ was created using C3 for two arbitrary
parabolas, one can argue similarly that $\ell_1$ has an equation whose coefficients lie in a
2-3 tower. We omit the details.

It follows that if $\ell_1$ or $\ell_2$ (or both) were created using C3, then there is a 2-3
tower containing the coefficients of their defining equations. As in the proof of
Theorem 10.1.6, we conclude that the coordinates of the intersection of $\ell_1$ and $\ell_2$ lie
in the same tower.

Next suppose that we use P2 to create $\alpha$, so that $\alpha$ comes from the intersection of
a circle and a line. By our inductive assumption and the above argument, we may
assume that the circle and line are defined by equations whose coefficients lie in a
2-3 tower, and then we are done by the argument of Theorem 10.1.6.

Finally, when we use P3, the argument is identical to what we did in
Theorem 10.1.6. This completes the proof of the theorem.


Here is a nice example of Theorem 10.3.4. 
Example 10.3.5 The 2-3 tower
$$\mathbb{Q} \subset \mathbb{Q}(2\cos(2\pi/7)) \subset \mathbb{Q}(\zeta_7)$$


shows that $\zeta_7$ is an origami number. It follows that a regular heptagon (7-gon) can be
constructed by origami.
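
To see concretely why the tower in Example 10.3.5 has steps of degree 3 and 2, one can use two standard facts that are not spelled out in the example: $c = 2\cos(2\pi/7)$ is a root of $x^3 + x^2 - 2x - 1$, and $\zeta_7$ is a root of $z^2 - cz + 1$ because $c = \zeta_7 + \zeta_7^{-1}$. The Python sketch below checks both relations in floating point; the variable names are ours.

```python
import cmath
from math import cos, pi

c = 2 * cos(2 * pi / 7)            # generator of the middle field Q(2cos(2*pi/7))
zeta7 = cmath.exp(2j * pi / 7)     # primitive 7th root of unity

cubic_residual = c ** 3 + c ** 2 - 2 * c - 1        # degree-3 step over Q
quadratic_residual = zeta7 ** 2 - c * zeta7 + 1     # degree-2 step over Q(c)

# Both residuals should vanish up to rounding error.
print(abs(cubic_residual), abs(quadratic_residual))
```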


We can also characterize origami numbers using Galois theory. 


Theorem 10.3.6 Let $\alpha \in \mathbb{C}$ be algebraic over $\mathbb{Q}$ and let $\mathbb{Q} \subset L$ be the splitting field
of the minimal polynomial of $\alpha$ over $\mathbb{Q}$. Then $\alpha$ is an origami number if and only if
$[L : \mathbb{Q}] = 2^a 3^b$ for some integers $a, b \ge 0$.

Proof: The argument is similar to the proof of Theorem 10.1.12. If $\alpha \in \mathcal{O}$, one
first proves that $\mathbb{Q} \subset \mathcal{O}$ is a normal extension, so that $L \subset \mathcal{O}$. Then the formula for
$[L : \mathbb{Q}]$ follows by applying Theorem 10.3.4 to a primitive element of $\mathbb{Q} \subset L$. For the
converse one uses Burnside’s $p^n q^m$ Theorem (Theorem 8.1.8) to show that $\mathrm{Gal}(L/\mathbb{Q})$
is solvable because $|\mathrm{Gal}(L/\mathbb{Q})| = [L : \mathbb{Q}] = 2^a 3^b$. The desired 2-3 tower is then easily
constructed using the Galois correspondence and the definition of solvable group.
We leave the details as Exercise 8.


Here are two examples of Theorem 10.3.6. 


Example 10.3.7 The results of Section 9.1 imply that $\mathbb{Q} \subset \mathbb{Q}(\zeta_{11})$ is a Galois
extension of degree 10. This is not of the form $2^a 3^b$, so that $\zeta_{11}$ cannot be constructed by
origami.
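
The reasoning of Example 10.3.7 applies to any $n$: since $\mathbb{Q} \subset \mathbb{Q}(\zeta_n)$ is a Galois extension of degree $\phi(n)$, Theorem 10.3.6 says that $\zeta_n$ is an origami number exactly when $\phi(n)$ has the form $2^a 3^b$. The Python sketch below lists the small $n$ for which this fails; the helper names are ours, and the brute-force `euler_phi` is meant only for small inputs.

```python
from math import gcd

def euler_phi(n):
    """Euler phi-function by direct count (fine for small n)."""
    return sum(1 for k in range(1, n + 1) if gcd(k, n) == 1)

def is_2_3_number(m):
    """True when m = 2^a * 3^b with a, b >= 0."""
    for p in (2, 3):
        while m % p == 0:
            m //= p
    return m == 1

def non_origami_ngons(limit):
    """All n with 3 <= n <= limit for which phi(n) is not of the form 2^a * 3^b."""
    return [n for n in range(3, limit + 1) if not is_2_3_number(euler_phi(n))]

print(non_origami_ngons(25))
```

In this range the regular 11-gon is the first regular polygon that cannot be folded, and the heptagon of Example 10.3.5 does not appear in the list.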


Example 10.3.8 Let $\alpha \in \mathbb{C}$ be a root of $f = x^6 + x + 1$. Using Maple or Mathematica,
one easily checks that $f$ is irreducible over $\mathbb{Q}$, so that $\mathbb{Q} \subset \mathbb{Q}(\alpha)$ is an extension of
degree 6. However, even though $6 = 2 \cdot 3$, $\alpha$ is not an origami number. This follows
from the galois command in Maple, which shows that the splitting field $\mathbb{Q} \subset L$ of
$f$ has Galois group $\mathrm{Gal}(L/\mathbb{Q}) \simeq S_6$. Then Theorem 10.3.6 implies that $\alpha \notin \mathcal{O}$, since
$[L : \mathbb{Q}] = |\mathrm{Gal}(L/\mathbb{Q})| = 6! = 2^4 \cdot 3^2 \cdot 5$.


You will prove the following corollary of Theorem 10.3.6 in Exercise 9. 


Corollary 10.3.9 Let $f(x) \in \mathbb{Q}[x]$ be a polynomial of degree $\le 4$. Then the roots of
$f(x)$ are origami numbers, i.e., we can solve $f(x) = 0$ by origami.


The papers [7] and [14] give explicit descriptions of how to solve cubics by
origami. The paper [7] also treats quartics.


C. Marked Rulers and Intersections of Conics. Origami numbers can
also be constructed with a marked ruler or intersections of conics. We will discuss these
methods briefly (without proofs), beginning with the marked ruler.

A marked ruler is a straightedge with two marks on it one unit apart. This is
sometimes called a twice-notched straightedge. A marked ruler can construct a line
in two ways: first, by connecting two known points, and second, by verging, which
given a known point $P$ and known lines $l_1$ and $l_2$ draws a line through $P$ that meets $l_1$
at $Q_1$ and $l_2$ at $Q_2$ such that the segment $Q_1Q_2$ has length 1:
A marked-ruler construction begins with the points 0, 1, and i. At each step, one
constructs a new line by applying either of the two operations just described to the 
already constructed lines and points and then intersecting the new line with other 
already constructed lines to get new points in C. 

Here is a quartic equation that comes from verging with a marked ruler. 


Example 10.3.10 Let $l_1$ be the line $y = x$ and $l_2$ be the line $y = -\frac{1}{2}x$, and let $P = (\frac{1}{2}, 0)$.
Then let $\ell$ be a line with slope $m$ through $P$. In Exercise 10 you will show that $\ell$
meets $l_1$ at the point


(10.13) $Q_1 = (x_1, x_1) \in l_1$, where $x_1 = \dfrac{m}{2m - 2}$,


and meets $l_2$ at the point


(10.14) $Q_2 = (x_2, -\frac{1}{2}x_2) \in l_2$, where $x_2 = \dfrac{m}{2m + 1}$.
If we think of $\ell$ as the marked ruler, then verging from $P$ with $l_1$ and $l_2$ means that
the distance from $Q_1$ to $Q_2$ is 1. This gives the equation


$$\Bigl(\frac{m}{2m+1} - \frac{m}{2m-2}\Bigr)^2 + \Bigl(\frac{-m}{2(2m+1)} - \frac{m}{2m-2}\Bigr)^2 = 1,$$
which simplifies to the quartic equation

(10.15) $7m^4 - 16m^3 - 21m^2 + 8m + 4 = 0.$


The roots of this equation are all real and represent the slopes of the four lines through
$P$ that are constructed with a marked ruler by verging with the lines $l_1$ and $l_2$.

In Exercise 9 of Section 13.1, we will see that the Galois group of (10.15) is $S_4$
(this can also be done using the galois command in Maple). Hence the splitting
field is an extension of $\mathbb{Q}$ of degree 24. This is not a power of 2, so that these lines
are not constructible with straightedge and compass.
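
The claims of Example 10.3.10 are easy to test numerically. The Python sketch below locates the four real roots of (10.15) by bisection on brackets found by inspecting sign changes (an assumption specific to this quartic), and then checks that for each slope $m$ the points $Q_1$ and $Q_2$ of (10.13) and (10.14) are one unit apart. The function names are ours.

```python
from math import hypot

def quartic(m):
    """Left-hand side of (10.15)."""
    return 7 * m ** 4 - 16 * m ** 3 - 21 * m ** 2 + 8 * m + 4

def bisect_root(f, lo, hi, steps=200):
    """Root of f in [lo, hi], assuming f(lo) and f(hi) have opposite signs."""
    f_lo = f(lo)
    for _ in range(steps):
        mid = (lo + hi) / 2
        f_mid = f(mid)
        if (f_lo < 0) == (f_mid < 0):
            lo, f_lo = mid, f_mid
        else:
            hi = mid
    return (lo + hi) / 2

def verging_distance(m):
    """Distance between Q1 = (x1, x1) and Q2 = (x2, -x2/2) for the line of slope m through (1/2, 0)."""
    x1 = m / (2 * m - 2)
    x2 = m / (2 * m + 1)
    return hypot(x2 - x1, -x2 / 2 - x1)

# The quartic changes sign on each of these intervals.
brackets = [(-2.0, -1.0), (-1.0, 0.0), (0.0, 1.0), (3.0, 4.0)]
slopes = [bisect_root(quartic, lo, hi) for lo, hi in brackets]
for m in slopes:
    print(m, verging_distance(m))   # each distance should be close to 1
```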


See Exercise 11 for an example where verging leads to a cubic equation. Given 
that origami also solves cubic and quartic equations, the following result proved in 
[15, Ch. 10] is not surprising. 


Theorem 10.3.11 Let $\alpha \in \mathbb{C}$. Then $\alpha$ can be constructed using a marked ruler if and
only if $\alpha$ is an origami number, i.e., $\alpha \in \mathcal{O}$.


We next consider conics. These can be defined geometrically in terms of foci, 
directrices, and eccentricities, or one can work algebraically, giving separate treat- 
ments for ellipses, hyperbolas, and parabolas. We will use a third approach, which 
defines a conic to be a curve in the plane defined by an equation of the form 


(10.16) $F(x, y) = ax^2 + bxy + cy^2 + dx + ey + f = 0$


where $a, b, c, d, e, f \in \mathbb{R}$ and $(a, b, c) \ne (0, 0, 0)$. We also assume that (10.16) has at
least one solution with $x, y$ real. This excludes equations like $x^2 + y^2 + 1 = 0$.
We write the equation (10.16) in matrix form as follows. Let


$$A = \begin{pmatrix} a & \frac{1}{2}b & \frac{1}{2}d \\ \frac{1}{2}b & c & \frac{1}{2}e \\ \frac{1}{2}d & \frac{1}{2}e & f \end{pmatrix},$$
and let
$$\mathbf{x} = \begin{pmatrix} x \\ y \\ 1 \end{pmatrix}.$$
Then one easily checks that
(10.17) $F(x, y) = \mathbf{x}^t A \mathbf{x},$


where $\mathbf{x}^t$ is the transpose of $\mathbf{x}$. Then the conic $C$ defined by (10.16) is nondegenerate
if $\det(A) \ne 0$. As shown in [2], if $C$ is nondegenerate, then


$b^2 - 4ac < 0 \iff C$ is an ellipse,
$b^2 - 4ac = 0 \iff C$ is a parabola,
$b^2 - 4ac > 0 \iff C$ is a hyperbola.
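
The matrix form (10.17) and the discriminant test translate directly into a short computation. The Python sketch below builds the matrix $A$ with exact rational arithmetic, evaluates $\mathbf{x}^t A \mathbf{x}$, and classifies a conic from its six coefficients. The function names are ours, and `classify` assumes, as in the text, that the conic has at least one real point.

```python
from fractions import Fraction

def conic_matrix(a, b, c, d, e, f):
    """Symmetric 3x3 matrix A of (10.17) for ax^2 + bxy + cy^2 + dx + ey + f."""
    h = Fraction(1, 2)
    return [[Fraction(a), h * b, h * d],
            [h * b, Fraction(c), h * e],
            [h * d, h * e, Fraction(f)]]

def det3(M):
    """Determinant of a 3x3 matrix by cofactor expansion along the first row."""
    (p, q, r), (s, t, u), (v, w, x) = M
    return p * (t * x - u * w) - q * (s * x - u * v) + r * (s * w - t * v)

def evaluate(M, x, y):
    """The quadratic form x^t A x with x = (x, y, 1)."""
    vec = [Fraction(x), Fraction(y), Fraction(1)]
    return sum(vec[i] * M[i][j] * vec[j] for i in range(3) for j in range(3))

def classify(a, b, c, d, e, f):
    """Type of the conic; assumes it has a real point, as required in the text."""
    if det3(conic_matrix(a, b, c, d, e, f)) == 0:
        return "degenerate"
    disc = b * b - 4 * a * c
    if disc < 0:
        return "ellipse"
    if disc == 0:
        return "parabola"
    return "hyperbola"

print(classify(1, 0, 1, 0, 0, -1))   # x^2 + y^2 - 1 = 0
print(classify(-1, 0, 0, 0, 1, 0))   # y - x^2 = 0
print(classify(0, 1, 0, 0, 0, -1))   # xy - 1 = 0
print(classify(1, 0, -1, 0, 0, 0))   # x^2 - y^2 = 0, a pair of lines
```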


To do constructions by intersecting conics, start with 0 and 1, and construct either 
a line connecting two already constructed points or a conic whose coefficients are 
previously constructed real numbers. Then we get new points by intersecting these 
lines and conics. This gives the following set of complex numbers. 


Theorem 10.3.12 Let $\alpha \in \mathbb{C}$. Then $\alpha$ is constructible by intersecting conics if
and only if $\alpha$ is an origami number, i.e., $\alpha \in \mathcal{O}$.


Proof: This is proved by Alperin in [1]. Alternatively, Videla shows in [21] that $\alpha$
is constructible by conics if and only if $\alpha$ lies in a 2-3 tower. When we combine this
with part (c) of Theorem 10.3.4, the theorem follows immediately.


Putting together the results from this section, we have the following equivalences 
for a complex number a: 


$\alpha$ is an origami number $\iff$ $\alpha$ is constructible by marked ruler
$\iff$ $\alpha$ is constructible by intersecting conics
$\iff$ $\alpha$ lies in a 2-3 tower $\mathbb{Q} = F_0 \subset \cdots \subset F_n$
$\iff$ $\alpha$ is algebraic over $\mathbb{Q}$, and the Galois group
of its minimal polynomial has order $2^a 3^b$.
Mathematical Notes
Here are two topics for further discussion. 


• Marked Ruler and Compass. By using a marked ruler and a compass, one can do
constructions beyond what is possible by marked ruler alone. A marked ruler allows 
us to verge using a point and two lines, but with a compass to draw circles, we can 
also verge using a point and two circles or a point, a circle, and a line. An example 
of the latter is Archimedes’ angle trisection: 


(10.18) 


Here, we have a unit circle centered at $O$ and a point $P$ on the circle that makes the
indicated angle $\theta$ with the line $l$. Then verging from $P$ with $l$ and the circle gives the
dashed line containing points $Q$ on the circle and $R$ on $l$ that are one unit apart. In
Exercise 12 you will prove that $\angle PRO = \theta/3$. (Note that we can trisect angles using
only the marked ruler. Exercise 13 gives such a trisection due to Pappus.)

Using a marked ruler and compass also enables us to construct points not possible 
by marked ruler alone. An example in Baragar’s paper [4] shows that the real roots 
of the polynomial 


$$x^5 - 4x^4 + 2x^3 + 4x^2 + 2x - 6$$


can be constructed using a marked ruler and compass (what Baragar calls a “compass
and twice-notched ruler construction”). This polynomial is irreducible over $\mathbb{Q}$, and
the methods used to analyze $x^5 - 6x + 3$ in Section 6.4 imply that its splitting field
$\mathbb{Q} \subset L$ has Galois group $\mathrm{Gal}(L/\mathbb{Q}) \simeq S_5$. It follows that the roots of this polynomial
are not expressible in terms of radicals and are not origami numbers, but can be 
constructed using marked ruler and compass. However, it is not known exactly 
which numbers can be constructed in this way. 

In the exercises you will show that marked-ruler-and-compass constructions can 
be interpreted in terms of intersecting conchoids and limaçons with lines and circles.
Further details may be found in [4]. 


• Origami and Dual Conics. Origami and intersections of conics lead to the same
set of complex numbers. Since origami involves simultaneous tangents to parabolas, 
it is reasonable to ask if origami has an intrinsic connection to intersections of conics. 
The answer involves the dual conic of a parabola, whose points correspond to tangent 
lines of the parabola. Then simultaneous tangents to two parabolas correspond to 
intersections of their dual conics. To make these ideas precise, one needs to work in 
the projective plane, which is beyond the scope of this book. See [1] for a discussion
of the ideas involved. 


Historical Notes 


What we call “conics” are more properly called “conic sections,” for they were 
defined by the Greeks as the intersections of a cone with a plane. One of the first 
Greek geometers to consider conic sections was Menaechmus (ca. 350 B.C.). He was
a student of Plato and Eudoxus. He showed how to duplicate the cube by intersecting 
two parabolas (Exercise 11 of Section 10.1). Thus the idea of solving cubic equations 
by intersecting conics goes back to the very beginning of the study of conic sections. 

In his book On the Heptagon in the Circle, Archimedes (287-212 B.C.) may have 
constructed a regular heptagon using the intersections of conics. Although this book 
no longer exists, works by Islamic geometers such as Thabit ibn Qurra (826-901) on 
the same problem mention Archimedes’ book and use these methods to construct the 
regular heptagon. 

One of the major works of Greek geometry is the Conic Sections by Apollonius
(ca. 262-190 B.C.). This treatise introduced the terms ellipse, parabola, and
hyperbola. A description of the Conic Sections can be found in [9, Ch. 6]. 

A later writer was Pappus (ca. 300), who wrote extensive commentaries on various 
aspects of Greek geometry. His work contains the first known description of a conic 
section in terms of focus, directrix, and eccentricity, though his description probably 
appeared in earlier but now lost works. Pappus gave a nice angle trisection using 
intersections of conics (see Exercise 14). 

There is also a large Islamic literature on constructions using conic sections. As 
noted by Martin [15, p. 135], over a dozen conic constructions of the regular heptagon 
were found by Islamic geometers during the Middle Ages. Besides Thabit ibn Qurra 
mentioned earlier, another prominent geometer is Abu Ali Hasan ibn al-Haytham 
(ca. 965-1039), known in the West as Alhazen. He is best known for the problem of 
describing reflections in a circular mirror, which he solved by intersecting a hyperbola 
and a circle. 

The emergence of equations for conics, such as (10.16), took a while. The standard 
equations for the ellipse, hyperbola, and parabola are implicit in many of the results 
proved by the Greeks, but it wasn’t until the work of L’Hôpital in 1707 that they
were written down in their modern form. Much more on the history of the conic 
sections can be found in [5] and [6]. 

The marked ruler was first used by Nicomedes (ca. 240 B.C.) to construct cube
roots (see Exercise 15). Pappus, in one of his commentaries on
Greek geometry, described verging as moving a ruler “about a fixed point until by 
trial the intercept [the portion of the ruler lying between the given lines] was found to 
be equal to the given length.” We’ve already seen that Archimedes and Pappus used 
a marked ruler to trisect angles. Nicomedes also introduced the conchoid, which is a 
curve created by verging with a line (see Exercise 16). In 1593, Viète proposed that
verging with a marked ruler be allowed for geometric constructions. See [4] and [15] 
for more details on the history of the marked ruler. 
The connection with paperfolding or origami seems to be more recent. One of the
earliest references is Geometric Exercises in Paper Folding by T. Sundara Row [19], 
published in Madras in 1893. The origami trisection given at the beginning of the 
section was discovered in the 1970s by Hisashi Abe and is taken from [12]. More 
references on origami can be found in [1], [3], [12], [14], and [16], and Hull’s book 
[13] contains a wealth of activities related to origami and mathematics. We also note 
that origami is equivalent to a construction called “mira” described in [8] and [15]. 


Exercises for Section 10.3 


Exercise 1. This exercise will use the diagram


to prove that the origami construction described at the beginning of the section trisects the
angle $\theta$ formed by the line $l_2$ and the bottom of the square.
(a) Let $Q$ be the intersection of the line segments $P_1Q_2$ and $P_2Q_1$. Prove that $Q$ lies on the
dashed line $\ell$.
(b) Prove that $\theta$ is congruent to $\alpha + \beta$.
(c) Use triangles $\triangle P_1PQ_1$ and $\triangle P_2PQ_1$ to prove that $\beta$ and $\gamma$ are congruent.
(d) Use triangle $\triangle P_1QQ_1$ to prove that $\alpha$ is congruent to $\beta + \gamma$.
(e) Conclude that $\alpha$ is congruent to $2\theta/3$ and that the angle formed by $P_1Q_1$ and the bottom
of the square is $\theta/3$.


Exercise 2. In the text we showed how to trisect an angle between $\pi/4$ and $\pi/2$ by origami.
(a) Explain how to bisect and double angles by origami. 
(b) Explain how to trisect an arbitrary angle by origami. 


Exercise 3. Let $P_1$ be a point not lying on a line $l_1$ in the plane. Drop a perpendicular from
$P_1$ to $l_1$ that meets $l_1$ at a point $S$. Then choose rectangular coordinates such that $P_1$ lies on
the positive $y$-axis and the $x$-axis is the perpendicular bisector of the segment $P_1S$. In this
coordinate system, $P_1 = (0, a)$ and $l_1$ is defined by $y = -a$, where $a > 0$.

(a) The parabola with focus $P_1$ and directrix $l_1$ is defined to be the set of all points $Q$ that are
equidistant from $P_1$ and $l_1$. Prove that it is defined by the equation $4ay = x^2$.

(b) Let $Q = (x_0, y_0)$ be a point on the parabola. Prove that the $y$-intercept of its tangent line
is $-y_0$.

(c) Let $Q = (x_0, y_0)$ be a point on the parabola, and let $Q_1 \in l_1$ be obtained by dropping a
perpendicular from $Q$. Prove that $Q_1$ is the reflection of $P_1$ about the tangent line to the
parabola at $Q$.

(d) Part (c) proves one direction of Lemma 10.3.1. Prove the other direction to complete the
proof of the lemma.


Exercise 4. Show that the tangent line at a point $(x_1, y_1)$ on the first parabola in (10.9) has
slope given by
$$m = \frac{b}{y_1 - \frac{1}{2}a}.$$


Exercise 5. In the text we showed that the slopes of the simultaneous tangents to the parabolas 
in (10.9) are roots of (10.12). In this exercise, you will give an origami version of this in the 
special case when a = 2 and b= 1. Begin with a square sheet of paper folded so that the 
bottom edge touches the top. This fold will be the positive x-axis, and the left edge of the sheet 
will be the directrix for the first parabola in (10.9). 

(a) Describe the origami moves one would use to construct the foci and directrices of the 
parabolas in (10.9) when a = 2 and b = 1. Also construct the y-axis. Exercise 7 will be 
helpful. 

(b) Now perform an origami move that takes the focus of each parabola to a point on the 
corresponding directrix. Explain why there is only one way to do this. 

(c) Part (b) gives a line whose slope $m$ is the real root of $x^3 + 2x + 1$. Explain what origami
moves you would use to find the point on the x-axis whose coordinates are $(m,0)$.


Exercise 6. Suppose that in the situation of C3, we have points $\alpha_1 \neq \alpha_2$ not lying on lines
$\ell_1 \neq \ell_2$. Also assume that $\ell_1$ and $\ell_2$ are parallel and that there is a line $\ell$ satisfying C3 (i.e., $\ell$
reflects $\alpha_i$ to a point on $\ell_i$ for $i = 1,2$). Prove that the distance between $\ell_1$ and $\ell_2$ is at most the
distance between $\alpha_1$ and $\alpha_2$. This makes it easy to find examples where the line described in
C3 does not exist.


Exercise 7. Consider the parabolas $(y - \frac{1}{2}a)^2 = 2bx$ and $y = \frac{1}{2}x^2$ from (10.9).

(a) Show that the first parabola has focus $(\frac{1}{2}b, \frac{1}{2}a)$ and directrix $x = -\frac{1}{2}b$.

(b) Show that the second parabola has focus $(0, \frac{1}{2})$ and directrix $y = -\frac{1}{2}$.

Hence the focus and directrix of the first parabola are defined over any subfield of $\mathbb{R}$ containing
$a$ and $b$. For the second, this is true over any subfield of $\mathbb{R}$.


Exercise 8. Complete the proof of Theorem 10.3.6 sketched in the text.


Exercise 9. Prove Corollary 10.3.9.


Exercise 10. In Example 10.3.10, prove that $\ell$ meets $l_1$ and $l_2$ at the points $Q_1$ and $Q_2$ given in
(10.13) and (10.14). Also draw the four lines whose slopes are the roots of (10.15).


Exercise 11. This exercise will give an example of a cubic equation that arises from verging.
Consider the lines $l_1$ defined by $y = 0$ and $l_2$ defined by $y = x$, and verge from $P = (1, \frac{1}{2})$ using
a marked ruler. Show that this gives the vertical line $x = 1$ together with three nonvertical lines
whose slopes $m$ satisfy the cubic equation

$$4m^3 + m^2 - 4m + 1 = 0.$$

Also show that the nonvertical lines cannot be constructed by straightedge and compass.


Exercise 12. Prove that $\angle PRO = \theta/3$ in the construction (10.18).


Exercise 13. According to [15], Pappus used a marked ruler to trisect angles as follows. Given
an angle $0 < \theta < \pi/2$, write it as $\theta = \angle POA$, where:

• The distance between $P$ and $O$ is 1/2.
• The line $l_1$ determined by $P$ and $A$ is perpendicular to the line determined by $O$ and $A$.

Any angle $0 < \theta < \pi/2$ can be put in this form by a marked-ruler construction. Finally, let $l_2$
be the line through $P$ that is perpendicular to $l_1$. Then verging with $O$ and the lines $l_1$ and
$l_2$ gives points $Q \in l_1$ and $R \in l_2$ such that $Q$ and $R$ are one unit apart.

Prove that $\angle QOA = \theta/3$.


Exercise 14. As explained in [21], Pappus used intersections of conics to trisect angles as 
follows. Consider the unit circle centered at the origin, and let $\theta$ satisfy $0 < \theta < \pi/2$. Then
$P = (\cos\theta, \sin\theta)$ is the corresponding point on the unit circle. We assume that $P$ is known.
Also let $O = (0,0)$ be the origin, and set $A = (1,0)$. Thus $\theta = \angle POA$.

(a) Consider the curve $C$ consisting of all points $Q = (x,y)$ such that the distance from $P$ to
$Q$ is twice the distance from $Q$ to the x-axis. The curve $C$ intersects the unit circle at a
point $R$ lying in the interior of $\angle POA$. Prove that $\angle ROA = \theta/3$.

(b) Show that the curve C is a hyperbola. It follows that we have trisected an angle using the 
intersection of a hyperbola and a circle, i.e., an intersection of conics. 


Exercise 15. In this exercise, we discuss a marked-ruler construction of cube roots due to 
Nicomedes and taken from [15]. Let k be a real number such that 0 < k < 8, and consider 
an isosceles triangle $\triangle ABC$ such that $AC$ and $BC$ have length 1 and $AB$ has length $k/4$. Then
extend $AC$ and $AB$ as indicated in the picture below, and choose $D$ on the extension of $AC$ so
that $AD$ also has length 1. Finally, draw the line through $D$ and $B$.


Verging from $C$ with the lines $l_1$ and $l_2$ indicated above gives points $Q \in l_1$ and $R \in l_2$ that are
one unit apart. Assume that $Q \neq D$.

(a) Explain why the restriction $0 < k < 8$ is necessary.

(b) Prove that the distance between $B$ and $R$ is $\sqrt[3]{k}$.

(c) Explain how to give a marked-ruler construction of $\sqrt[3]{k}$ for any $k > 0$.


Exercise 16. Let $P$ be a point at distance $b > 0$ from a line $l$. Put a marked ruler through $P$ with
one mark at $R \in l$. When $R$ moves along $l$, the other mark $Q_1$ or $Q_2$ (depending on which side
of $l$ it is on) traces out the conchoid of Nicomedes. When $b < 1$ we get the picture


We can relate the conchoid to construction problems as follows.

(a) Suppose we are given a point $P$ and lines $l_1$, $l_2$, and assume that $P \notin l_1$. Prove that a
point $Q$ is obtained by verging with $P$ and $l_1$, $l_2$ if and only if $Q$ is one of the points of
intersection of $l_2$ with the conchoid determined by $P$ and $l_1$.

(b) Prove that the angle trisection of (10.18) can be interpreted as the intersection of the unit
circle with the conchoid determined by $P$ and $l$.

(c) Suppose that $P = (0,0)$ and $l$ is the horizontal line $y = -b$. Prove that the polar equation
of the conchoid is

$$r = -b\csc\theta \pm 1,$$

where the minus sign gives the portion of the curve above $l$ and the plus sign gives the
portion below.

(d) Under the assumptions of part (c), show that the Cartesian equation of the conchoid is

$$(x^2 + y^2)(y + b)^2 = y^2.$$


By part (a), verging is the same as intersecting the conchoid with a line. Since the above 
equation has degree 4, this explains why verging leads to an equation of degree 4. 


Exercise 17. Let P be a point on a circle, and consider a marked ruler that goes through P. 
If we place one mark on a point $Q$ on the circle, then the other mark $R_1$ or $R_2$ (depending on
whether it is inside or outside the circle) traces out a curve called the limaçon of Pascal.

This curve was known to Jordanus Nemorarius (1225-1260) and Albrecht Dürer (1471-1528)
and possibly the ancient Greeks. It was rediscovered by Étienne Pascal (father of Blaise Pascal)
about a century after Dürer. In 1650 Roberval, unaware of the earlier work, named the curve
in Pascal’s honor.

(a) Show that the angle trisection (10.18) can be interpreted as the intersection of the line $l$
with the limaçon determined by the circle and the point $P$.

(b) Let $P = (0,0)$ and let $C$ be the circle of radius $a$ and center $(a,0)$. Show that the
corresponding limaçon has polar equation

$$r = 1 + 2a\cos\theta.$$

(c) In the situation of part (b), show that the Cartesian equation of the limaçon is

$$(x^2 + y^2 - 2ax)^2 = x^2 + y^2.$$


Exercise 18. A Pierpont prime is a prime $p > 3$ of the form $p = 2^k 3^l + 1$. Prove that a regular
n-gon can be constructed by origami (or by marked ruler or by intersections of conics) if and
only if $n = 2^a 3^b p_1 \cdots p_s$, where $a, b \geq 0$ and $p_1, \ldots, p_s$ are distinct Pierpont primes. This was
first proved by Pierpont in [17].


CHAPTER 11


FINITE FIELDS 


The main topic of this chapter is the theory of finite fields. We will study their 
existence and uniqueness and compute their Galois groups. We will also consider 
irreducible polynomials over finite fields. 


11.1. THE STRUCTURE OF FINITE FIELDS 


In this section we discuss the basic properties of finite fields. 


A. Existence and Uniqueness. The simplest examples of finite fields are $\mathbb{F}_p$,
the integers modulo a prime $p$. These relate to arbitrary finite fields as follows.


Proposition 11.1.1 Let $F$ be a finite field. Then:
(a) There is a unique prime $p$ such that $F$ contains a subfield isomorphic to $\mathbb{F}_p$.
(b) $F$ is a finite extension of $\mathbb{F}_p$, and

$$|F| = p^n, \quad \text{where } n = [F : \mathbb{F}_p].$$

Proof: Every field of characteristic 0 contains a subfield isomorphic to $\mathbb{Q}$ and
hence is infinite. Thus $F$ has characteristic $p$ for some prime $p$. Furthermore, the
discussion of characteristic in Section A.1 shows that $p\mathbb{Z} \subset \mathbb{Z}$ is the kernel of the ring
homomorphism that sends $m \in \mathbb{Z}$ to $m \cdot 1 \in F$. By the Fundamental Theorem of Ring
Homomorphisms, $F$ contains a subfield isomorphic to $\mathbb{Z}/p\mathbb{Z} = \mathbb{F}_p$.

The map $\mathbb{F}_p \to F$ makes $F$ an extension field of $\mathbb{F}_p$. Following our usual practice
(see Definition 3.1.2), we identify $\mathbb{F}_p$ with its image and write $\mathbb{F}_p \subset F$. Now consider
$F$ as a vector space over $\mathbb{F}_p$. The elements of $F$ give finitely many vectors in $F$ whose
span over $\mathbb{F}_p$ is obviously $F$. It follows that $F$ is a finite-dimensional vector space
over $\mathbb{F}_p$. As in Section 4.3, this means that $F$ is a finite extension of $\mathbb{F}_p$. Furthermore,
if $n = [F : \mathbb{F}_p]$, then we can find a basis $\alpha_1, \ldots, \alpha_n$ of $F$ over $\mathbb{F}_p$. Hence every element
$\beta \in F$ can be written uniquely as

$$\beta = a_1\alpha_1 + \cdots + a_n\alpha_n, \quad a_i \in \mathbb{F}_p.$$

Since the $a_i$ can be any of the $p$ elements of $\mathbb{F}_p$, there are $p^n$ possibilities for $\beta$. Thus
$|F| = p^n$. This completes the proof. ∎


For the rest of this chapter, we will assume as in the above proof that a finite
field $F$ contains $\mathbb{F}_p$ as a subfield. Our first major result is that $F$ is the splitting field
over $\mathbb{F}_p$ of a particularly simple polynomial.


Theorem 11.1.2 Let $F$ be a finite field with $q = p^n$ elements. Then:
(a) $\alpha^q = \alpha$ for all $\alpha \in F$.

(b) $x^q - x = \prod_{\alpha \in F}(x - \alpha)$.

(c) $F$ is a splitting field over $\mathbb{F}_p$ of $x^q - x \in \mathbb{F}_p[x]$.


Proof: Since $F$ has $q$ elements, its multiplicative group $F^* = F \setminus \{0\}$ is a group with
$q - 1$ elements. By Lagrange's Theorem, it follows that $\alpha^{q-1} = 1$ for all $\alpha \in F^*$, so that
$\alpha^q = \alpha$ for all $\alpha \in F$.
This proves part (a) and shows that the $q$ elements of $F$ are roots of $x^q - x$. Then
part (b) follows since $x^q - x$ is monic of degree $q$. Hence $x^q - x$ splits completely
over $F$. Since every element of $F$ is a root, $x^q - x$ can’t split completely over any
strictly smaller field. Thus $F$ is a splitting field of $x^q - x \in \mathbb{F}_p[x]$. ∎


Using this theorem, we obtain the following uniqueness result for finite fields. 
Corollary 11.1.3 Two finite fields with the same number of elements are isomorphic. 


Proof: Corollary 5.1.7 implies that any two splitting fields of $x^q - x \in \mathbb{F}_p[x]$ are
isomorphic. Then Corollary 11.1.3 follows immediately from Theorem 11.1.2. ∎


We next show that a finite field of order $p^n$ exists for any $p$ and $n$.


Theorem 11.1.4 Given any prime $p$ and any positive integer $n$, there is a finite field
with $p^n$ elements.


Proof: Let $q = p^n$, and let $L$ be an extension of $\mathbb{F}_p$ such that $x^q - x$ splits completely
over $L$. Since we are in characteristic $p$, the derivative of $x^q - x$ is $qx^{q-1} - 1 = -1$, so that
$\gcd(x^q - x, (x^q - x)') = 1$. Thus $x^q - x$ is separable and hence has distinct roots in $L$.
This means that $F = \{\alpha \in L \mid \alpha^q = \alpha\}$ is a subset of $L$ consisting of $q$ elements. In
Exercise 1 you will show that $F$ is a subfield of $L$. It follows that $F$ is a finite field
with $q = p^n$ elements. ∎


Given any $q = p^n$ as in Theorem 11.1.4, the finite field of order $q$ constructed in
the theorem is unique up to isomorphism by Corollary 11.1.3. Hence we can speak
of “the” finite field with $q$ elements. We will denote this field as $\mathbb{F}_q$. Since these fields
were first described by Galois (see the Historical Notes), $\mathbb{F}_q$ is sometimes denoted as
GF(q), where “GF” stands for “Galois Field.”

One can use Theorem 11.1.2 to count the number of roots of a polynomial in a 
finite field as follows. 


Proposition 11.1.5 If $f \in \mathbb{F}_p[x]$ is nonconstant and $n \geq 1$, then the number of roots
of $f$ in $\mathbb{F}_{p^n}$ is the degree of the polynomial $\gcd(f, x^{p^n} - x)$.


Proof: Let $g = \gcd(f, x^{p^n} - x)$, where the gcd is computed in $\mathbb{F}_p[x]$. A useful
observation is that if one replaces $\mathbb{F}_p$ with any larger field, then one gets the same
polynomial $g$ (you will prove this in Exercise 2). Thus we may compute the gcd in
$\mathbb{F}_{p^n}[x]$. If we denote the elements of this field by $\alpha_i$ for $i = 1, \ldots, p^n$, then

$$x^{p^n} - x = (x - \alpha_1) \cdots (x - \alpha_{p^n})$$

by part (b) of Theorem 11.1.2. This is the irreducible factorization of $x^{p^n} - x$ in
$\mathbb{F}_{p^n}[x]$. Hence $g$ is the product of those $x - \alpha_i$ that divide $f$. Since $x - \alpha_i$ divides $f$ if
and only if $f(\alpha_i) = 0$, we obtain the product formula

$$g = \prod_{f(\alpha_i) = 0} (x - \alpha_i).$$

The proposition now follows immediately. ∎


Here is an example to illustrate Proposition 11.1.5.


Example 11.1.6 Consider the polynomial
f=x'42°42r+1€ Ff. 


To compute the number of roots in $\mathbb{F}_{7^3}$, we need to compute $\gcd(f, x^{7^3} - x)$. In Maple,
we do this using the command

    Gcd(f,x^(7^3)-x) mod 7;

In Mathematica, we would type

    PolynomialGCD[f, x^(7^3)-x, Modulus -> 7]

In both cases, the output is the polynomial 


xt 4x7 4x46. 
By Proposition 11.1.5, $f$ has three roots in $\mathbb{F}_{7^3}$. Furthermore, replacing 7^3 with 7^4
in the above computation gives a gcd of 1, so that $f$ has no roots in $\mathbb{F}_{7^4}$.

One drawback of this method is that the degree of $x^{p^n} - x$ increases rapidly. For
example, if we replace 7^3 with 7^8 in the above computation, then Maple gives


x84 3x7 4294354 5x4 42° 44x46 


(thus $f$ has eight roots in $\mathbb{F}_{7^8}$), whereas Mathematica gives an error message because
the degree of $x^{7^8} - x = x^{5764801} - x$ is too large for PolynomialGCD. ◁▷
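
The degree problem can be avoided, because $\gcd(f, x^{p^n} - x)$ does not change when $x^{p^n}$ is first reduced modulo $f$. The following Python sketch is not from the text: the function names, the coefficient-list representation (lowest degree first), and the sample cubic are all illustrative assumptions. It computes $x^{p^n} \bmod f$ by applying the $p$th power map $n$ times, so no polynomial of degree larger than $2\deg(f)$ is ever formed.

```python
# Sketch only: polynomials over F_p are lists of coefficients, lowest degree first.
# Requires Python 3.8+ for pow(a, -1, p) (modular inverse).

def trim(a):
    """Drop trailing zero coefficients in place and return the list."""
    while a and a[-1] == 0:
        a.pop()
    return a

def poly_rem(a, b, p):
    """Remainder of a divided by the nonzero polynomial b in F_p[x]."""
    a = trim([c % p for c in a])
    b = trim([c % p for c in b])
    inv = pow(b[-1], -1, p)
    while len(a) >= len(b):
        factor = a[-1] * inv % p
        shift = len(a) - len(b)
        for i, c in enumerate(b):
            a[shift + i] = (a[shift + i] - factor * c) % p
        trim(a)  # the leading coefficient is now zero
    return a

def poly_mul_rem(a, b, f, p):
    """Product a*b reduced modulo f in F_p[x]."""
    if not a or not b:
        return []
    prod = [0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            prod[i + j] = (prod[i + j] + x * y) % p
    return poly_rem(prod, f, p)

def poly_pow_rem(a, e, f, p):
    """a**e reduced modulo f, by repeated squaring."""
    result = poly_rem([1], f, p)
    base = poly_rem(a, f, p)
    while e:
        if e & 1:
            result = poly_mul_rem(result, base, f, p)
        base = poly_mul_rem(base, base, f, p)
        e >>= 1
    return result

def poly_gcd(a, b, p):
    """Monic gcd of a and b in F_p[x] (Euclidean algorithm)."""
    a = trim([c % p for c in a])
    b = trim([c % p for c in b])
    while b:
        a, b = b, poly_rem(a, b, p)
    if a:
        inv = pow(a[-1], -1, p)
        a = [c * inv % p for c in a]
    return a

def count_roots(f, p, n):
    """Number of roots of nonconstant f in F_{p^n}, via deg gcd(f, x^(p^n) - x)."""
    h = [0, 1]                      # the polynomial x
    for _ in range(n):              # after the loop, h = x^(p^n) mod f
        h = poly_pow_rem(h, p, f, p)
    h = h + [0] * (2 - len(h))      # make room for the x coefficient
    h[1] = (h[1] - 1) % p           # h = x^(p^n) - x mod f
    return len(poly_gcd(f, trim(h), p)) - 1

# Hypothetical test polynomial: x^3 + 4x^2 + x + 6 over F_7 (no roots in F_7).
cubic = [6, 1, 4, 1]
print([count_roots(cubic, 7, n) for n in (1, 2, 3)])
```

For this cubic, which has no roots in $\mathbb{F}_7$ and is therefore irreducible, the sketch reports no roots in $\mathbb{F}_7$ or $\mathbb{F}_{7^2}$ and three roots in $\mathbb{F}_{7^3}$, in line with Proposition 11.1.5.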


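B. Galois Groups. We next compute the Galois group of the extension $\mathbb{F}_p \subset \mathbb{F}_q$.


Theorem 11.1.7 If $q = p^n$, then:

(a) $\mathbb{F}_p \subset \mathbb{F}_q$ is a Galois extension of degree $n$.

(b) The map $\mathrm{Frob}_p : \mathbb{F}_q \to \mathbb{F}_q$ defined by $\mathrm{Frob}_p(\alpha) = \alpha^p$ is an automorphism of $\mathbb{F}_q$
that is the identity on $\mathbb{F}_p$.

(c) $\mathrm{Frob}_p$ generates $\mathrm{Gal}(\mathbb{F}_q/\mathbb{F}_p)$. Thus there is a group isomorphism

$$\mathrm{Gal}(\mathbb{F}_q/\mathbb{F}_p) \simeq \mathbb{Z}/n\mathbb{Z}$$

that sends $\mathrm{Frob}_p \in \mathrm{Gal}(\mathbb{F}_q/\mathbb{F}_p)$ to $[1] \in \mathbb{Z}/n\mathbb{Z}$.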


Proof: In the proof of Theorem 11.1.4, we noted that $x^q - x$ is separable. Then
Theorem 11.1.2 implies that $\mathbb{F}_q$ is the splitting field of a separable polynomial. Hence
$\mathbb{F}_p \subset \mathbb{F}_q$ is Galois. Proposition 11.1.1 implies that $[\mathbb{F}_q : \mathbb{F}_p] = n$ since $q = p^n$.

Turning to part (b), observe that by Lemma 5.3.10,

$$\mathrm{Frob}_p(\alpha + \beta) = (\alpha + \beta)^p = \alpha^p + \beta^p = \mathrm{Frob}_p(\alpha) + \mathrm{Frob}_p(\beta).$$

Since we also have $\mathrm{Frob}_p(1) = 1^p = 1$ and

$$\mathrm{Frob}_p(\alpha\beta) = (\alpha\beta)^p = \alpha^p\beta^p = \mathrm{Frob}_p(\alpha)\,\mathrm{Frob}_p(\beta),$$

it follows that $\mathrm{Frob}_p$ is a ring homomorphism. By Exercise 2 of Section 3.1, $\mathrm{Frob}_p$ is
also one-to-one and hence onto, since it maps the finite set $\mathbb{F}_q$ to itself. Thus $\mathrm{Frob}_p$ is
an automorphism of $\mathbb{F}_q$. Since it is the identity on $\mathbb{F}_p$ by Lemma 9.1.2, we conclude
that $\mathrm{Frob}_p \in \mathrm{Gal}(\mathbb{F}_q/\mathbb{F}_p)$.

For part (c), we first note that since $\mathbb{F}_p \subset \mathbb{F}_q$ is Galois, we have

$$|\mathrm{Gal}(\mathbb{F}_q/\mathbb{F}_p)| = [\mathbb{F}_q : \mathbb{F}_p] = n,$$

where the second equality uses $q = p^n$ and Proposition 11.1.1. It follows that the
order of $\mathrm{Frob}_p$ divides $n$. Suppose that $(\mathrm{Frob}_p)^r$ is the identity, where $0 < r < n$.
Here, $(\mathrm{Frob}_p)^r$ denotes the $r$-fold composition of $\mathrm{Frob}_p$ with itself, so that

$$(\mathrm{Frob}_p)^r(\alpha) = \underbrace{\mathrm{Frob}_p(\cdots \mathrm{Frob}_p(\mathrm{Frob}_p}_{r \text{ times}}(\alpha))\cdots) = \underbrace{(\cdots(\alpha^p)^p\cdots)^p}_{r \text{ times}} = \alpha^{p^r}.$$

Thus, if $(\mathrm{Frob}_p)^r$ is the identity element of $\mathrm{Gal}(\mathbb{F}_q/\mathbb{F}_p)$, then

$$\alpha^{p^r} = \alpha$$

for all $\alpha \in \mathbb{F}_q$. Since $0 < r < n$, this implies that the polynomial $x^{p^r} - x$ of degree
$p^r < p^n = q$ has $q$ roots, which is clearly impossible. Hence $\mathrm{Frob}_p$ has order $n$, which
easily gives the desired isomorphism $\mathrm{Gal}(\mathbb{F}_q/\mathbb{F}_p) \simeq \mathbb{Z}/n\mathbb{Z}$. ∎
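
The statements of Theorem 11.1.7 can be checked by brute force in a small case. The sketch below is an illustration rather than part of the text: it assumes the model $\mathbb{F}_8 = \mathbb{F}_2[x]/(x^3 + x + 1)$, stores each element as a 3-bit integer whose bits are the coefficients, and uses hypothetical helper names. Addition in this model is bitwise XOR.

```python
# Sketch only: F_8 modeled as F_2[x]/(x^3 + x + 1); elements are ints 0..7.
MODULUS = 0b1011  # bit pattern of x^3 + x + 1, irreducible over F_2

def mul(a, b):
    """Multiply two elements of F_8 (shift-and-add with reduction)."""
    result = 0
    while b:
        if b & 1:
            result ^= a          # add the current multiple of a
        b >>= 1
        a <<= 1                  # multiply a by x
        if a & 0b1000:
            a ^= MODULUS         # reduce using x^3 = x + 1
    return result

def frob(a):
    """Frobenius map a -> a^p with p = 2."""
    return mul(a, a)

def frob_power(a, r):
    """Apply the Frobenius map r times."""
    for _ in range(r):
        a = frob(a)
    return a

def frobenius_order(max_r=3):
    """Smallest r >= 1 such that Frob^r is the identity on F_8."""
    for r in range(1, max_r + 1):
        if all(frob_power(a, r) == a for a in range(8)):
            return r
    return None

elements = range(8)
additive = all(frob(a ^ b) == frob(a) ^ frob(b) for a in elements for b in elements)
multiplicative = all(frob(mul(a, b)) == mul(frob(a), frob(b))
                     for a in elements for b in elements)
bijective = sorted(frob(a) for a in elements) == list(elements)
fixed_points = [a for a in elements if frob(a) == a]
print(additive, multiplicative, bijective, fixed_points, frobenius_order())
```

In this model the map is a bijective ring homomorphism, its only fixed points are 0 and 1 (the copy of $\mathbb{F}_2$), and its order is $3 = [\mathbb{F}_8 : \mathbb{F}_2]$.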


We call $\mathrm{Frob}_p$ the Frobenius automorphism of $\mathbb{F}_q$. We next use Theorem 11.1.7
to determine when one finite field is contained in another.


Corollary 11.1.8 Let $\mathbb{F}_{p^m}$ and $\mathbb{F}_{p^n}$ be finite fields. Then $\mathbb{F}_{p^m}$ is isomorphic to a subfield
of $\mathbb{F}_{p^n}$ if and only if $m \mid n$.


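Proof: First suppose that $\mathbb{F}_{p^m}$ is isomorphic to a subfield of $\mathbb{F}_{p^n}$. Writing this as an
inclusion, we obtain

$$\mathbb{F}_p \subset \mathbb{F}_{p^m} \subset \mathbb{F}_{p^n}.$$

Proposition 11.1.1 and the Tower Theorem imply that

$$n = [\mathbb{F}_{p^n} : \mathbb{F}_p] = [\mathbb{F}_{p^n} : \mathbb{F}_{p^m}]\,[\mathbb{F}_{p^m} : \mathbb{F}_p] = [\mathbb{F}_{p^n} : \mathbb{F}_{p^m}]\,m.$$

This shows that $m$ divides $n$.

Conversely, suppose that $m \mid n$. Since $\mathrm{Gal}(\mathbb{F}_{p^n}/\mathbb{F}_p)$ is cyclic of order $n$ by Theorem 11.1.7,
we know that $\mathrm{Gal}(\mathbb{F}_{p^n}/\mathbb{F}_p)$ has a subgroup $H$ of order $n/m$. By the Galois
correspondence of Section 7.3, the fixed field $F$ of $H$ is an extension

$$\mathbb{F}_p \subset F \subset \mathbb{F}_{p^n}$$

satisfying

$$[F : \mathbb{F}_p] = [\mathrm{Gal}(\mathbb{F}_{p^n}/\mathbb{F}_p) : H] = \frac{n}{n/m} = m.$$

Using Proposition 11.1.1, we see that $F$ has order $p^m$. By Corollary 11.1.3, it follows
that $F$ is a subfield of $\mathbb{F}_{p^n}$ isomorphic to $\mathbb{F}_{p^m}$. ∎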


In Exercise 3 you will prove Corollary 11.1.8 using neither Theorem 11.1.7 nor
the Galois correspondence.

When $m \mid n$, Corollary 11.1.8 gives $\mathbb{F}_{p^m} \simeq F \subset \mathbb{F}_{p^n}$. As usual, we identify $\mathbb{F}_{p^m}$ with
$F$, which gives the inclusion

$$\mathbb{F}_{p^m} \subset \mathbb{F}_{p^n}.$$

Then we can generalize Theorem 11.1.7 as follows.


Theorem 11.1.9 Let $m \mid n$ and $\mathbb{F}_{p^m} \subset \mathbb{F}_{p^n}$. Then there is a group isomorphism

$$\mathrm{Gal}(\mathbb{F}_{p^n}/\mathbb{F}_{p^m}) \simeq \mathbb{Z}/\tfrac{n}{m}\mathbb{Z}$$

that sends $(\mathrm{Frob}_p)^m \in \mathrm{Gal}(\mathbb{F}_{p^n}/\mathbb{F}_{p^m})$ to $[1] \in \mathbb{Z}/\tfrac{n}{m}\mathbb{Z}$.


Proof: You will prove this in Exercise 4. ∎


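This result makes it easy to work out the Galois correspondence of $\mathbb{F}_p \subset \mathbb{F}_{p^n}$. The
key point is that subfields of $\mathbb{F}_{p^n}$ correspond to subgroups of $\mathbb{Z}/n\mathbb{Z}$, and subgroups of
$\mathbb{Z}/n\mathbb{Z}$ correspond to positive divisors of $n$. Here is an example inspired by the classic
book [13] by Lidl and Niederreiter.


Example 11.1.10 For $\mathbb{F}_{p^{30}}$, the above remarks show that subfields of $\mathbb{F}_{p^{30}}$ correspond
to positive divisors of 30. This gives the following Galois correspondence, where each row lists
subfields on the left and the corresponding subgroups of $\mathbb{Z}/30\mathbb{Z}$ on the right:

    Subfields                              Subgroups
    F_{p^30}                               {0}
    F_{p^6}    F_{p^10}   F_{p^15}         Z/5Z    Z/3Z    Z/2Z
    F_{p^2}    F_{p^3}    F_{p^5}          Z/15Z   Z/10Z   Z/6Z
    F_p                                    Z/30Z


To understand the diagram, recall that an intermediate field $\mathbb{F}_p \subset F \subset \mathbb{F}_{p^{30}}$ corresponds
to the subgroup $\mathrm{Gal}(\mathbb{F}_{p^{30}}/F) \subset \mathrm{Gal}(\mathbb{F}_{p^{30}}/\mathbb{F}_p) \simeq \mathbb{Z}/30\mathbb{Z}$. Combining this with
Theorem 11.1.9, we see that $F = \mathbb{F}_{p^m}$ corresponds to $\mathrm{Gal}(\mathbb{F}_{p^{30}}/\mathbb{F}_{p^m}) \simeq \mathbb{Z}/\tfrac{30}{m}\mathbb{Z}$. When
we do this for each divisor of 30, we obtain the above diagram. ◁▷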
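
The bookkeeping behind this correspondence is easy to automate. The following sketch (illustrative only; the function names are assumptions, not notation from the text) lists, for each divisor $m$ of $n$, the subgroup of $\mathbb{Z}/n\mathbb{Z}$ generated by $[m]$, which by Theorem 11.1.9 is the image of $\mathrm{Gal}(\mathbb{F}_{p^n}/\mathbb{F}_{p^m})$.

```python
def divisors(n):
    """Positive divisors of n in increasing order."""
    return [d for d in range(1, n + 1) if n % d == 0]

def galois_correspondence(n):
    """Map each divisor m of n to the subgroup of Z/nZ generated by [m].

    Under Z/nZ ~ Gal(F_{p^n}/F_p), with [1] corresponding to Frob_p, this
    subgroup corresponds to Gal(F_{p^n}/F_{p^m}) and has order n/m.
    """
    return {m: sorted({(m * k) % n for k in range(n)}) for m in divisors(n)}

def is_subfield(a, b):
    """F_{p^a} embeds in F_{p^b} exactly when a divides b (Corollary 11.1.8)."""
    return b % a == 0

table = galois_correspondence(30)
for m, subgroup in table.items():
    print(f"F_(p^{m}) <-> subgroup of order {len(subgroup)}: {subgroup}")
```

For $n = 30$ the subgroup orders come out as 30, 15, 10, 6, 5, 3, 2, 1 for $m = 1, 2, 3, 5, 6, 10, 15, 30$, matching the example.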


Mathematical Notes 


Finite fields are used in many different areas of mathematics. Here are three of 
particular importance. 


• Finite Groups. If $F$ is any field, then the set GL(n, F) of invertible $n \times n$ matrices
with entries in F is a group under matrix multiplication. In particular, if F is a 
finite field, then GL(n, F) is a finite group. These groups play an important role in 
both Galois theory and the theory of finite groups. We will say more about this in 
Section 14.3. 


• Equations over Finite Fields. Given a nonzero polynomial $f \in \mathbb{R}[x,y]$, the solutions
of $f(x,y) = 0$ lying in $\mathbb{R}^2$ form a curve in the plane. For instance, the unit
circle is defined by $x^2 + y^2 - 1 = 0$. Similarly, given a polynomial $f \in \mathbb{F}_p[x,y]$, we
can consider solutions of $f(x,y) = 0$ lying in $\mathbb{F}_p^2$. Such equations were of interest to
Gauss. For instance, the equation $x^2 + y^2 + x^2y^2 \equiv 1 \bmod p$ appears in the last entry
in his mathematical diary [10].

Things get more interesting when one realizes that for $f \in \mathbb{F}_p[x,y]$, we can also
consider solutions of $f(x,y) = 0$ lying in $\mathbb{F}_{p^n}^2$ for any $n \geq 1$. Since $\mathbb{F}_{p^n}$ is finite, there
are only finitely many such solutions, though as $n$ gets larger, the number of solutions
increases. This is all controlled by the zeta function of the equation. See [9, Ch. 5],
[11, §11.5], and [12, pp. 158-160] for more about zeta functions.


• Coding Theory. Information in a computer is often represented as a string of 0’s
and 1’s. A good example is the ASCII code, which uses numbers between 0 and
127 to represent letters, digits, punctuation marks, and common special characters
on computers. The ASCII value of “g” is 103 (decimal) = 1100111 (binary), while
“G” is 71 (decimal) = 1000111 (binary). Thus the ASCII code uses $\mathbb{F}_2^7$, which is
a vector space of dimension 7 over $\mathbb{F}_2$.

In algebraic coding theory, a code consists of a subset of $\mathbb{F}_q^n$ (the set of code
words), where $q$ is usually a power of 2. In the study of error-correcting codes,
one wants the code words to be widely spaced so that errors in transmission can be 
detected, yet if they are too widely spaced, the code is not very efficient. Finding 
good codes is an active area of research. An introduction to coding theory can be 
found in Lidl and Niederreiter [13, Sec. 9.1]. One surprise is that an important class 
of codes (the so-called Goppa codes) involves equations f(x,y) = 0 over finite fields. 
See Moreno’s book [14] for an introduction. 
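
To make the ASCII remark concrete, the short sketch below (an illustration with hypothetical function names) converts a character to its vector in $\mathbb{F}_2^7$ and counts the coordinates in which two such vectors differ. That count, the Hamming distance, is one standard way to measure how widely spaced code words are.

```python
def ascii_vector(ch):
    """Return the 7-bit ASCII code of ch as a tuple of 0's and 1's."""
    code = ord(ch)
    if code > 127:
        raise ValueError("not a 7-bit ASCII character")
    return tuple(int(bit) for bit in format(code, "07b"))

def hamming_distance(u, v):
    """Number of coordinates in which the vectors u and v differ."""
    return sum(a != b for a, b in zip(u, v))

g_lower = ascii_vector("g")   # 103 = 1100111 in binary
g_upper = ascii_vector("G")   # 71 = 1000111 in binary
print(g_lower, g_upper, hamming_distance(g_lower, g_upper))
```

The two vectors differ in a single coordinate, so a one-bit transmission error can turn “g” into “G”. This is why plain ASCII is not an error-correcting code.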


Historical Notes 


The finite field $\mathbb{F}_p$ first arose in the study of congruences modulo the prime $p$.
For an arbitrary modulus, congruences are implicit in the work of Fermat, Euler,
Lagrange, and Legendre, though Gauss was the first to give an explicit definition.
All of these people knew that congruences modulo a prime have special properties.
For example, in Disquisitiones, Gauss states the following result:


A congruence of the mth degree

$$Ax^m + Bx^{m-1} + Cx^{m-2} + \text{etc.} + Mx + N \equiv 0$$

whose modulus is a prime number $p$ that does not divide $A$, cannot be solved in
more than $m$ different ways, that is, it cannot have more than $m$ noncongruent
roots relative to $p$ . . .


(See [8, Art. 43].) In modern terms, this says that a polynomial in $\mathbb{F}_p[x]$ of degree $m$
has at most $m$ roots in $\mathbb{F}_p$, which is true because $\mathbb{F}_p$ is a field. Exercise 5 will give an
example of how this can fail when the modulus is not a prime.

In 1830, Galois published Sur la théorie des nombres in the Bulletin des sciences
mathématiques de Férussac (see [Galois, pp. 113-127]). Galois begins the paper by
noting that for congruences $F(x) \equiv 0 \bmod p$ as above, “one customarily considers
only integer solutions.” If instead one considers “incommensurable” solutions, Galois
states that “I have arrived at certain results that I think are new.” Essentially everything
in this section can be traced back to these results of Galois.

His construction goes as follows. Consider a polynomial $F \in \mathbb{Z}[x]$ of degree $\nu > 1$
whose reduction modulo $p$ in $\mathbb{F}_p[x]$ is irreducible of degree $\nu$ (Exercise 6 explains
how Galois stated this). Since $\nu > 1$, it follows that the congruence

(11.1)    $F(x) \equiv 0 \bmod p$

has no integer solutions. Thus, as stated in [Galois, p. 113], “One must regard the
roots of this congruence as a kind of imaginary symbol.” In analogy with the usual
symbol for $\sqrt{-1}$, Galois uses $i$ to denote a root of the congruence (11.1). Then he
considers expressions of the form

$$\alpha = a + a_1 i + a_2 i^2 + \cdots + a_{\nu-1} i^{\nu-1},$$

where $a, a_1, \ldots, a_{\nu-1}$ are integers modulo $p$. There are $p^\nu$ different choices for $\alpha$,
which give the finite field with $p^\nu$ elements. Galois proves the following facts about
this field:

• $\alpha^{p^\nu - 1} = 1$ when $\alpha \neq 0$.
• The elements of this field are the roots of $x^{p^\nu} - x$ (Theorem 11.1.2).
• All irreducible polynomials of degree $\nu$ lead to the same field (Corollary 11.1.3).
• Primitive roots exist, i.e., the nonzero elements of the field form a cyclic group
under multiplication (Proposition A.5.3).

Galois also knew Proposition 11.1.5, which he states as follows:

Next, to get the integer solutions, it suffices, as M. Libri appears to have been
the first to remark, to find the greatest factor common to $Fx = 0$ and $x^{p-1} = 1$.
If now one wants to have imaginary solutions of the second degree, one
finds the greatest factor common to $Fx = 0$ and $x^{p^2-1} = 1$, and in general, the
solutions of order $\nu$ will be given by the greatest factor common to $Fx = 0$ and
$x^{p^\nu - 1} = 1$.

(See [Galois, pp. 123-125].) Note that Galois uses an equal sign = to denote
congruence modulo $p$. Do you see how his version of Proposition 11.1.5 counts only
nonzero solutions?

Galois’s arguments are sketchy and assume the existence of a root $i$ of (11.1).
Nevertheless, his account is remarkably complete. For instance, given an arbitrary
$\nu > 1$, he uses a splitting field of $x^{p^\nu} - x$ to prove the existence of a polynomial $F$ of
degree $\nu$ that is irreducible modulo $p$. This is his way of proving Theorem 11.1.4.
As for the Galois group $\mathrm{Gal}(\mathbb{F}_{p^n}/\mathbb{F}_p)$, Galois doesn’t state Theorem 11.1.7 directly.
However, if $i$ is a root of (11.1), then Galois knew that the other roots are given by
$i^p, i^{p^2}, \ldots, i^{p^{\nu-1}}$, which is the usual way we use the Galois group to find the other
roots of an irreducible polynomial (see Exercise 7). Also, in Section 14.3 we will see
how Galois used $\mathrm{Gal}(\mathbb{F}_{p^n}/\mathbb{F}_p)$ to construct some interesting matrix groups.

However, Galois was not the only person to discover finite fields. Gauss described
a theory of “higher congruences” in an unpublished manuscript from around 1800,
and Schönemann described a theory of finite fields based on congruences in 1846.
Here is Schönemann’s approach. Let $f \in \mathbb{Z}[x]$ be monic of degree $n$ whose reduction
modulo $p$ is irreducible in $\mathbb{F}_p[x]$. Also let $\alpha \in \mathbb{C}$ be a root of $f$. Then Schönemann
considers expressions of the form $\varphi(\alpha)$, where $\varphi \in \mathbb{Z}[x]$, and he writes

(11.2)    $\varphi(\alpha) \equiv \psi(\alpha) \pmod{p, \alpha}$

to mean $\varphi(\alpha) = \psi(\alpha) + pR(\alpha)$, where $\varphi, \psi, R \in \mathbb{Z}[x]$. Using this, he shows that every
$\varphi(\alpha)$ is congruent modulo $(p, \alpha)$ to an expression of the form

$$a_{n-1}\alpha^{n-1} + \cdots + a_1\alpha + a_0,$$

where $0 \leq a_i \leq p - 1$ for $i = 0, \ldots, n-1$. There are $p^n$ such expressions.

From a modern perspective, $\varphi(\alpha)$ lies in the ring $\mathbb{Z}[\alpha]$, and the congruence classes
of (11.2) give the quotient ring

(11.3)    $\mathbb{Z}[\alpha]/p\mathbb{Z}[\alpha]$,

where $p\mathbb{Z}[\alpha]$ is the ideal of $\mathbb{Z}[\alpha]$ generated by $p$. It follows that Schönemann’s
construction leads to a ring with $p^n$ elements.

We prove that this ring is a field as follows. In Exercises 8 and 9 you will show
that since $f$ is monic, the evaluation map $q(x) \mapsto q(\alpha)$ induces a ring isomorphism

$$\mathbb{Z}[x]/f\mathbb{Z}[x] \simeq \mathbb{Z}[\alpha].$$

Furthermore, you will also show that this isomorphism induces an isomorphism

$$\mathbb{Z}[x]/(p, f) \simeq \mathbb{Z}[\alpha]/p\mathbb{Z}[\alpha],$$

where

(11.4)    $(p, f) = p\mathbb{Z}[x] + f\mathbb{Z}[x] = \{pR(x) + f(x)S(x) \mid R(x), S(x) \in \mathbb{Z}[x]\}$.

Thus we can interpret Schönemann’s construction as taking the quotient of $\mathbb{Z}[x]$ by
$(p, f)$ in two steps: first quotient out by $f$ to get $\mathbb{Z}[\alpha]$, and then quotient out by $p$ to
get (11.3). However, we can reverse the order of the steps: first quotient out by $p$
to get $\mathbb{Z}[x]/p\mathbb{Z}[x] \simeq \mathbb{F}_p[x]$, which sends $f$ to $\bar{f} \in \mathbb{F}_p[x]$, and then quotient out by $\bar{f}$ to
obtain the isomorphism

$$\mathbb{Z}[x]/(p, f) \simeq \mathbb{F}_p[x]/(\bar{f})$$

(see Exercise 9). Since $\bar{f} \in \mathbb{F}_p[x]$ is irreducible, the results of Chapter 3 show that
$\mathbb{F}_p[x]/(\bar{f})$ is an extension field of $\mathbb{F}_p$ in which $\bar{f}$ has a root (namely, the coset of $x$).
Combining these isomorphisms gives

(11.5)    $\mathbb{Z}[\alpha]/p\mathbb{Z}[\alpha] \simeq \mathbb{Z}[x]/(p, f) \simeq \mathbb{F}_p[x]/(\bar{f})$,

which proves that Schönemann’s construction gives a finite field with $p^n$ elements.
Also, this isomorphism sends the coset of $\alpha$ to the coset of $x$. If we let $i$ denote this
coset, then we recover Galois’s construction of the finite field. So everything fits
together nicely. See [5] for more on Schönemann’s work on finite fields.

Besides the constructions of Schönemann and Galois, the isomorphisms (11.5)
show that we can represent a finite field with $p^n$ elements as

(11.6)    $\mathbb{Z}[x]/(p, f)$

whenever $f$ is monic of degree $n$ and irreducible modulo $p$. This construction is
implicit in Schönemann’s work and was made explicit by Dedekind in 1857. Unlike
the constructions of Galois (who assumed the existence of $i$) and Schönemann (who used
the Fundamental Theorem of Algebra to find $\alpha$), Dedekind’s construction is purely algebraic and
became the standard way to define finite fields (see, for example, Dickson’s book [6]
from 1901). In 1893 E. H. Moore showed that any finite field is isomorphic to one
of the form (11.6). He also introduced the notation GF($p^n$) for the finite field with
$p^n$ elements constructed in (11.6), although these days the notations GF($p^n$) and $\mathbb{F}_{p^n}$
are often used interchangeably.

For more details on the history of finite fields, we refer the reader to [7, Vol. I, pp. 
233-252] and [13, pp. 73-78]. 


Exercises for Section 11.1 


Exercise 1. Let $\mathbb{F}_p \subset L$ be an extension such that $x^q - x$ splits completely over $L$, where $q = p^n$,
and let $F$ be the set of roots of this polynomial. Prove that $F$ is a subfield of $L$.


Exercise 2. Suppose that $f, g \in F[x]$ are polynomials, not both zero, and let $h$ be their greatest
common divisor as computed in $F[x]$. Now let $L$ be an extension field of $F$. Prove that $h$ is the
greatest common divisor of $f, g$ when considered as polynomials in $L[x]$.


Exercise 3. Give a proof of Corollary 11.1.8 that uses neither Theorem 11.1.7 nor the Galois 
correspondence. 


Exercise 4. Prove Theorem 11.1.9. 


Exercise 5. As noted in the text, if $f \in \mathbb{Z}[x]$ has degree $n$ and its leading coefficient is not
divisible by a prime $p$, then $f(x) \equiv 0 \bmod p$ has at most $n$ solutions modulo $p$. Here are two
questions that explore what happens when $n = 2$ and the modulus is arbitrary.
(a) How many solutions does the congruence $x^2 - 1 \equiv 0 \bmod 8$ have modulo 8?
(b) Fix an integer $m > 1$, and assume that every polynomial of degree 2 in $(\mathbb{Z}/m\mathbb{Z})[x]$ has at
most two roots in $\mathbb{Z}/m\mathbb{Z}$. Is $m$ prime?


Exercise 6. Let $F \in \mathbb{Z}[x]$ have degree $n$, and assume that the leading coefficient of $F$ is not
divisible by $p$. Prove that the reduction of $F$ modulo $p$ is irreducible over $\mathbb{F}_p$ if and only if it
is not possible to find polynomials $\varphi, \psi, \chi \in \mathbb{Z}[x]$, where $\deg(\varphi), \deg(\psi) < n$, such that

$$\varphi(x)\psi(x) = F(x) + p\chi(x).$$

This is how Galois defines irreducibility modulo $p$ in [Galois, p. 113].


Exercise 7. Let $f \in \mathbb{F}_p[x]$ be irreducible of degree $\nu$. Use (7.1) and Theorem 11.1.7 to prove
Galois’s observation that if $i$ is one root of $f$ in a splitting field, then the other roots are given
by $i^p, i^{p^2}, \ldots, i^{p^{\nu-1}}$.


Exercise 8. Let $I$ and $J$ be ideals in a ring $R$, and let $I + J = \{r + s \mid r \in I, s \in J\}$ be their sum.
Also let $\bar{I} = \{r + J \mid r \in I\}$. This is a subset of the quotient ring $R/J$.

(a) Prove that $I + J$ is an ideal of $R$ and that $\bar{I}$ is an ideal of $R/J$.

(b) Show that the map $r + (I + J) \mapsto (r + J) + \bar{I}$ is a well-defined ring isomorphism

$$R/(I + J) \simeq (R/J)/\bar{I}.$$


Exercise 9. Let $f \in \mathbb{Z}[x]$ be monic and irreducible, and let $\alpha \in \mathbb{C}$ be a root of $f$. Then let
$\bar{f} \in \mathbb{F}_p[x]$ be the reduction of $f$ modulo the prime $p$, and let $(p, f)$ be as in (11.4).
(a) Prove that the map $q(x) + f\mathbb{Z}[x] \mapsto q(\alpha)$ is a well-defined ring isomorphism
$\mathbb{Z}[x]/f\mathbb{Z}[x] \simeq \mathbb{Z}[\alpha]$.
(b) Use Exercise 8 to prove that $\mathbb{Z}[x]/(p, f) \simeq \mathbb{Z}[\alpha]/p\mathbb{Z}[\alpha]$.
(c) Similarly prove that $\mathbb{Z}[x]/(p, f) \simeq \mathbb{F}_p[x]/(\bar{f})$.


Exercise 10. Let f = 2+ 2+ 2x? +. 2x? + 2x4 + 20° + 2x6 4 2x7 4x8 42° +2! € F3[x].
(a) Use the method of Example 11.1.6 to determine the number of roots of $f$ in $\mathbb{F}_{3^3}$ and $\mathbb{F}_{3^7}$.
(b) Explain why the splitting field of $f$ over $\mathbb{F}_3$ is $\mathbb{F}_{3^{21}}$.


Exercise 11. Let $f \in \mathbb{F}_p[x]$ be an irreducible polynomial of degree $n$. Prove that $f$ splits
completely in $\mathbb{F}_{p^n}$.


11.2 IRREDUCIBLE POLYNOMIALS OVER FINITE FIELDS (OPTIONAL) 


Our presentation of finite fields in Section 11.1 was very abstract: it wasn’t until the
Historical Notes that we explicitly constructed $\mathbb{F}_{p^n}$ as $\mathbb{F}_p[x]/(f)$, where $f \in \mathbb{F}_p[x]$ is
irreducible of degree $n$. Yet whenever finite fields are implemented on a computer,
such a representation is essential. It follows that we need a good understanding of
irreducible polynomials in $\mathbb{F}_p[x]$.
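
As a small illustration of such a representation, the sketch below models $\mathbb{F}_9$ as $\mathbb{F}_3[x]/(x^2 + 1)$. The choice of $p = 3$ and $f = x^2 + 1$, and all function names, are assumptions made for this example only. An element $a + bi$, where $i$ is the coset of $x$ and $i^2 = -1$, is stored as the pair `(a, b)`.

```python
from itertools import product

P = 3  # the prime; x^2 + 1 has no roots modulo 3, so it is irreducible over F_3

def add(u, v):
    """Add two elements of F_9, coordinate by coordinate modulo 3."""
    return ((u[0] + v[0]) % P, (u[1] + v[1]) % P)

def mul(u, v):
    """Multiply (a + b i)(c + d i) using the relation i^2 = -1."""
    a, b = u
    c, d = v
    return ((a * c - b * d) % P, (a * d + b * c) % P)

def power(u, e):
    """Compute u**e by repeated multiplication (e >= 0)."""
    result = (1, 0)
    for _ in range(e):
        result = mul(result, u)
    return result

ELEMENTS = list(product(range(P), repeat=2))   # all 9 pairs (a, b)
ZERO, ONE = (0, 0), (1, 0)

# Every nonzero element has a multiplicative inverse, so this ring is a field.
has_inverses = all(any(mul(u, v) == ONE for v in ELEMENTS)
                   for u in ELEMENTS if u != ZERO)

# Part (a) of Theorem 11.1.2: every element satisfies u^9 = u.
fixed_by_q_power = all(power(u, 9) == u for u in ELEMENTS)

# The powers of 1 + i run through all 8 nonzero elements.
generated = {power((1, 1), k) for k in range(8)}
print(len(ELEMENTS), has_inverses, fixed_by_q_power, len(generated))
```

The check on the powers of $1 + i$ shows that the multiplicative group of this model is cyclic of order 8. The same pattern works for any prime $p$ and any irreducible $f$, at the cost of a general polynomial reduction step in `mul`.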


A. Irreducible Polynomials of Fixed Degree. We begin with the following
easy result concerning irreducible polynomials in $\mathbb{F}_p[x]$.


Proposition 11.2.1 Let $f \in \mathbb{F}_p[x]$ be irreducible of degree $m$. Then:

(a) $f$ divides $x^{p^m} - x$.

(b) $f$ is separable.

(c) Given an integer $n \geq 1$, $f$ divides $x^{p^n} - x \iff f$ has a root in $\mathbb{F}_{p^n} \iff m \mid n$.


Proof: We begin with part (c). Let $\alpha$ be a root of $f$ in a splitting field over $\mathbb{F}_p$. Since
$f$ is irreducible, $\mathbb{F}_p \subset \mathbb{F}_p(\alpha)$ has degree $m$. Then $\mathbb{F}_p(\alpha) \simeq \mathbb{F}_{p^m}$ by Proposition 11.1.1
and Corollary 11.1.3. From here, the second equivalence of part (c) follows directly
from Corollary 11.1.8. Since $f$ is irreducible over $\mathbb{F}_p$, we also have $f \mid \gcd(f, x^{p^n} - x)$
if and only if $\deg(\gcd(f, x^{p^n} - x)) > 0$. Then the first equivalence follows from
Proposition 11.1.5.

We get part (a) by taking $n = m$ in part (c), and part (b) follows immediately, since
$x^{p^m} - x$ is separable by the proof of Theorem 11.1.4. ∎


More generally, one can show that if $F$ is any finite field and $f \in F[x]$ is irreducible,
then $f$ is separable (see Exercise 1). Since irreducible polynomials in characteristic
0 are always separable, it follows that nonseparable irreducible polynomials can
occur only over infinite fields of characteristic $p$. Do you see how this relates to the
examples of such polynomials presented in Example 5.3.11?

We next want to count the number of irreducible polynomials of fixed degree in
$\mathbb{F}_p[x]$. This number is finite, since we are working over a finite field. Furthermore,
since any irreducible polynomial becomes monic after multiplying by a suitable
constant, it suffices to compute the number

$$N_m = |\{f \in \mathbb{F}_p[x] \mid f \text{ is monic irreducible of degree } m\}|.$$


These numbers are related as follows.


Theorem 11.2.2 Let $N_m$ be as defined above. Then, for any $n \ge 1$, we have

$$\sum_{m \mid n} m N_m = p^n,$$

where the sum is over all positive divisors of $n$.


Proof: Since $x^{p^n} - x$ is separable, we know that it factors as a product of distinct
irreducible polynomials in $\mathbb{F}_p[x]$. Furthermore, since it is monic, we can assume
that the polynomials in the factorization are also monic. Finally, part (c) of
Proposition 11.2.1 shows that the polynomials in the factorization are all monic irreducible
polynomials of $\mathbb{F}_p[x]$ whose degree $m$ divides $n$.

This allows us to write $x^{p^n} - x$ as follows. Let

$$\mathcal{N}_m = \{f \in \mathbb{F}_p[x] \mid f \text{ is monic irreducible of degree } m\},$$

so that $N_m = |\mathcal{N}_m|$. Then the previous paragraph implies that

$$x^{p^n} - x = \prod_{m \mid n} \; \prod_{f \in \mathcal{N}_m} f \tag{11.7}$$

(be sure you understand why). Since every $f \in \mathcal{N}_m$ has degree $m$, taking the degree
of each side of (11.7) gives the desired formula. ■


Here is an example of Theorem 11.2.2. 


Example 11.2.3 The monic irreducible polynomials of degree 1 in $\mathbb{F}_p[x]$ are of the
form $x - a$ for $a \in \mathbb{F}_p$. Thus $N_1 = p$. Then the theorem implies that

$$p^2 = 2N_2 + N_1 = 2N_2 + p,$$

so that $N_2 = \frac{1}{2}(p^2 - p)$. In Exercise 2 you will use this to prove that

$$N_4 = \tfrac{1}{4}(p^4 - p^2). \tag{11.8}$$

These formulas show that $N_2$ and $N_4$ are positive, which implies that we can find
irreducible polynomials of degrees 2 and 4. In particular, this proves the existence of
finite fields of orders $p^2$ and $p^4$. ◁▷
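
The value $N_2 = \frac{1}{2}(p^2 - p)$ can be confirmed by brute force for small primes. The sketch below is only an illustration (the function name `count_irreducible_quadratics` is an assumption of this example). It uses the fact that a quadratic is reducible exactly when it has a root in the base field, so this shortcut does not extend to degree 4.

```python
def count_irreducible_quadratics(p):
    """Count monic x^2 + b*x + c over F_p that have no root in F_p.

    A polynomial of degree 2 is reducible exactly when it has a root,
    so the rootless quadratics are the irreducible ones.
    """
    count = 0
    for b in range(p):
        for c in range(p):
            if all((a * a + b * a + c) % p != 0 for a in range(p)):
                count += 1
    return count


for p in (2, 3, 5, 7):
    # Second and third columns should agree: brute force versus (p^2 - p)/2.
    print(p, count_irreducible_quadratics(p), (p * p - p) // 2)
```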


To generalize the formulas in this example, we will use the Möbius function $\mu(n)$
from Exercise 14 of Section 9.1. This function is defined by

$$\mu(n) = \begin{cases} 1, & \text{if } n = 1,\\ (-1)^s, & \text{if } n = p_1 \cdots p_s \text{ for distinct primes } p_1,\dots,p_s,\\ 0, & \text{otherwise.} \end{cases}$$

Then we have the following formula for $N_n$.


Theorem 11.2.4 The number of monic irreducible polynomials of degree $n$ in $\mathbb{F}_p[x]$
is given by

$$N_n = \frac{1}{n} \sum_{m \mid n} \mu(m)\, p^{n/m}.$$

Proof: Let $F$ be a complex-valued function defined on the set of positive integers.
Then we get another such function $G$ defined by

$$G(n) = \sum_{m \mid n} F(m),$$

where as usual the sum is over all positive divisors of $n$. The Möbius inversion
formula asserts that in this situation, we can express $F$ in terms of $G$ as follows:

$$F(n) = \sum_{m \mid n} \mu(m)\, G\!\left(\frac{n}{m}\right).$$

A proof of the Möbius inversion formula can be found in most books on number
theory. See [11, Sec. 2.2] or [15, Sec. 4.3].

In particular, if $F(n) = nN_n$, then Theorem 11.2.2 implies that

$$G(n) = \sum_{m \mid n} F(m) = \sum_{m \mid n} m N_m = p^n.$$

By the inversion formula, we obtain

$$nN_n = F(n) = \sum_{m \mid n} \mu(m)\, G\!\left(\frac{n}{m}\right) = \sum_{m \mid n} \mu(m)\, p^{n/m}.$$

The desired formula follows immediately. ■
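Here are some examples of how this theorem works.


Example 11.2.5 When $n = 4$, Theorem 11.2.4 implies that

$$\begin{aligned}
N_4 &= \tfrac{1}{4}\bigl(\mu(1)p^4 + \mu(2)p^2 + \mu(4)p\bigr)\\
&= \tfrac{1}{4}\bigl(1 \cdot p^4 + (-1) \cdot p^2 + 0 \cdot p\bigr)\\
&= \tfrac{1}{4}(p^4 - p^2),
\end{aligned}$$

which agrees with (11.8). Similarly, when $n = 6$, you will show that

$$N_6 = \tfrac{1}{6}(p^6 - p^3 - p^2 + p)$$

in Exercise 3. ◁▷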
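
The formula of Theorem 11.2.4 is easy to evaluate by machine. The following Python sketch is illustrative only; the names `divisors`, `mobius`, and `num_irreducible` are assumptions of this example. It computes $N_n$ from the Möbius formula and then checks the identity $\sum_{m \mid n} mN_m = p^n$ of Theorem 11.2.2 for a few values.

```python
def divisors(n):
    """Positive divisors of n in increasing order."""
    return [d for d in range(1, n + 1) if n % d == 0]


def mobius(n):
    """Moebius function: 0 if a square > 1 divides n, else (-1)^(number of primes)."""
    result = 1
    q = 2
    while q * q <= n:
        if n % q == 0:
            n //= q
            if n % q == 0:          # q^2 divides the original n
                return 0
            result = -result
        q += 1
    if n > 1:                       # one prime factor is left over
        result = -result
    return result


def num_irreducible(p, n):
    """N_n = (1/n) * sum over m | n of mu(m) * p^(n/m)."""
    total = sum(mobius(m) * p ** (n // m) for m in divisors(n))
    return total // n


print(num_irreducible(2, 4), num_irreducible(2, 6))

# Consistency check against Theorem 11.2.2.
for p in (2, 3, 5):
    for n in range(1, 13):
        assert sum(m * num_irreducible(p, m) for m in divisors(n)) == p ** n
```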


In Exercise 4 you will use Theorem 11.2.4 to prove that $N_n > 0$ for all $n \ge 1$.


B. Cyclotomic Polynomials Modulo p. Theorem 11.1.2 tells us that $\alpha^q = \alpha$
for all $\alpha \in \mathbb{F}_q$. It follows that $\alpha^{q-1} = 1$ when $\alpha \ne 0$, so that every nonzero element
of $\mathbb{F}_q$ is a root of unity. In characteristic 0, the minimal polynomial of a primitive $d$th
root of unity in $\mathbb{C}$ is the cyclotomic polynomial $\Phi_d(x)$. We will now explore what
happens when we reduce these polynomials modulo $p$.

By Section 9.1, $\Phi_d(x)$ is the monic polynomial whose roots are the primitive $d$th
roots of unity in $\mathbb{C}$. Furthermore, $\Phi_d(x)$ has integer coefficients, is irreducible of
degree $\phi(d)$, and has the factorization

$$x^n - 1 = \prod_{d \mid n} \Phi_d(x) \tag{11.9}$$

in $\mathbb{Z}[x]$. The reduction of $\Phi_d(x)$ modulo $p$ should be denoted $\overline{\Phi}_d(x)$, but for simplicity
we will denote it by $\Phi_d(x)$. Thus (11.9) becomes an identity in $\mathbb{F}_p[x]$. We
will concentrate on the case when $\gcd(d,p) = 1$. This restriction is explained in
Exercise 5.

We begin by describing the roots of $\Phi_d(x)$. Recall that a $d$th root of unity $\alpha$ is
primitive if $d$ is the smallest positive integer such that $\alpha^d = 1$.


Proposition 11.2.6 If $\gcd(d, p) = 1$ and $q = p^n$, then the following are equivalent:

(a) $d \mid q - 1$.

(b) $\Phi_d(x)$ splits completely in $\mathbb{F}_q$.

(c) $\Phi_d(x)$ has a root in $\mathbb{F}_q$.

Furthermore, when these conditions are satisfied, the roots of $\Phi_d(x)$ in $\mathbb{F}_q$ consist of
the primitive $d$th roots of unity.


Proof: We first study the primitive $d$th roots of unity in characteristic $p$. Observe
that $x^d - 1$ is separable, since $\gcd(d, p) = 1$. Hence $x^d - 1$ has $d$ roots in a splitting
field. These roots form a group under multiplication, which is cyclic of order $d$ by
Proposition A.5.3. Such a group has $\phi(d)$ generators (Exercise 10 of Section 9.1),
so that there are $\phi(d)$ primitive $d$th roots of unity. By Exercise 11 of Section 9.1,

$$d = \sum_{e \mid d} \phi(e),$$

and by (11.9) for $n = d$,

$$x^d - 1 = \prod_{e \mid d} \Phi_e(x).$$

From these facts, it is straightforward to prove by complete induction on $d$ that the
roots of $\Phi_d(x)$ are the primitive $d$th roots of unity in characteristic $p$.

We now prove (a) ⇒ (b). Applying (11.9) with $n = q - 1$, we obtain

$$x^{q-1} - 1 = \prod_{e \mid q-1} \Phi_e(x).$$

Since $x^{q-1} - 1$ splits completely in $\mathbb{F}_q$ and $d \mid q - 1$, we see that $\Phi_d(x)$ splits completely
in $\mathbb{F}_q$. The implication (b) ⇒ (c) is trivial. Finally, to prove (c) ⇒ (a), note that a root
$\alpha \in \mathbb{F}_q$ of $\Phi_d(x)$ is a primitive $d$th root of unity by the above analysis. Then $d \mid q - 1$
(since $\alpha$ has order $d$ in $\mathbb{F}_q^*$), and (a) is proved.

The final assertion of the proposition now follows immediately. ■


We next compute the irreducible factors of $\Phi_d(x)$. Since $\gcd(d, p) = 1$, we have
$[p] \in (\mathbb{Z}/d\mathbb{Z})^*$. Let $m$ denote the order of $[p]$ in this group. Writing $[p]^m = [1]$ as a
congruence, we see that $m$ is the smallest positive integer such that $d \mid p^m - 1$. This
number determines the degree of the irreducible factors of $\Phi_d(x)$ as follows.


Theorem 11.2.7 Given $d$, let $m$ be as above. Then $\Phi_d(x)$ is the product of $\phi(d)/m$
irreducible polynomials in $\mathbb{F}_p[x]$ of degree $m$.


Proof: Let $f$ be an irreducible factor of $\Phi_d(x)$. To show that $\deg(f) = m$, it suffices
to show that the smallest positive integer $\ell$ such that $d \mid p^\ell - 1$ is $\ell = \deg(f)$. We prove
this as follows. For an integer $\ell \ge 1$, observe that

$$\begin{aligned}
d \mid p^\ell - 1 &\iff \Phi_d(x) \text{ splits completely over } \mathbb{F}_{p^\ell}\\
&\iff f \text{ splits completely over } \mathbb{F}_{p^\ell}\\
&\iff f \text{ has a root in } \mathbb{F}_{p^\ell}\\
&\iff \deg(f) \mid \ell.
\end{aligned}$$

The first equivalence is by (a) ⇔ (b) of Proposition 11.2.6, the second follows from
(b) ⇔ (c) of the same proposition because $f \mid \Phi_d(x)$, the third follows because $\mathbb{F}_p \subset \mathbb{F}_{p^\ell}$
is Galois and hence normal, and the fourth is by part (c) of Proposition 11.2.1. These
equivalences show that $\deg(f)$ has the desired property. ■


Here is an example of Theorem 11.2.7. 


Example 11.2.8 Let $p = 2$. For $d = 5$, one easily sees that $[2]$ has order 4 in $(\mathbb{Z}/5\mathbb{Z})^*$.
By Theorem 11.2.7, $\Phi_5(x)$ is the product of $\phi(5)/4 = 1$ irreducible polynomials of
degree 4 in $\mathbb{F}_2[x]$. Thus $\Phi_5(x) = x^4 + x^3 + x^2 + x + 1$ is irreducible in $\mathbb{F}_2[x]$. Its roots
are the primitive 5th roots of unity in $\mathbb{F}_{16}$.

When $d = 15$, $[2]$ also has order 4 in $(\mathbb{Z}/15\mathbb{Z})^*$. Thus $\Phi_{15}(x)$ is the product of
$\phi(15)/4 = 8/4 = 2$ irreducible polynomials of degree 4. In Exercise 6 you will verify
that the factorization in $\mathbb{F}_2[x]$ is

$$\Phi_{15}(x) = x^8 + x^7 + x^5 + x^4 + x^3 + x + 1 = (x^4 + x^3 + 1)(x^4 + x + 1).$$

The roots of these polynomials are the primitive 15th roots of unity in $\mathbb{F}_{16}$. ◁▷
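
The numbers in Theorem 11.2.7 depend only on the order of $[p]$ in $(\mathbb{Z}/d\mathbb{Z})^*$, so they are simple to compute. The Python sketch below is an illustration with hypothetical helper names (`euler_phi`, `multiplicative_order`, `factor_shape`, `gf2_mul`). It returns the number and the degree of the irreducible factors of $\Phi_d(x)$ modulo $p$, and it multiplies out the claimed factorization of $\Phi_{15}(x)$ over $\mathbb{F}_2$, encoding a polynomial as an integer whose bit $i$ is the coefficient of $x^i$.

```python
from math import gcd


def euler_phi(d):
    """Euler phi-function, by direct count."""
    return sum(1 for k in range(1, d + 1) if gcd(k, d) == 1)


def multiplicative_order(p, d):
    """Smallest m >= 1 with p^m = 1 (mod d); requires gcd(p, d) = 1."""
    if gcd(p, d) != 1:
        raise ValueError("p and d must be relatively prime")
    m, power = 1, p % d
    while power != 1 % d:
        power = (power * p) % d
        m += 1
    return m


def factor_shape(d, p):
    """(number of irreducible factors, common degree) of Phi_d modulo p."""
    m = multiplicative_order(p, d)
    return euler_phi(d) // m, m


def gf2_mul(a, b):
    """Product in F_2[x] of polynomials encoded as bitmasks (bit i <-> x^i)."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        b >>= 1
    return result


print(factor_shape(5, 2), factor_shape(15, 2))
# (x^4 + x^3 + 1)(x^4 + x + 1) compared with x^8 + x^7 + x^5 + x^4 + x^3 + x + 1
print(gf2_mul(0b11001, 0b10011) == 0b110111011)
```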


C. Berlekamp’s Algorithm. We conclude this section by explaining how to
determine whether a given nonconstant polynomial $f \in \mathbb{F}_p[x]$ is irreducible. One
method for doing this would be to list the finitely many nonconstant polynomials
of degree $< \deg(f)$ and divide each into $f$. We will give a much more efficient
method based on Berlekamp’s factoring algorithm that uses a nice combination of
linear algebra and the division algorithm.

Suppose that we want to test whether a given polynomial $f \in \mathbb{F}_p[x]$ is irreducible.
We may assume that $f$ has degree $n > 1$. Furthermore, since irreducible polynomials
over $\mathbb{F}_p$ are separable by Proposition 11.2.1, we may also assume that $f$ is separable.
We will use the quotient ring


$$R = \mathbb{F}_p[x]/\langle f \rangle.$$

By the division algorithm, every element of the ring $R$ can be written uniquely in the
form

$$a_0 + a_1 x + \cdots + a_{n-1}x^{n-1} + \langle f \rangle, \quad a_i \in \mathbb{F}_p.$$

It follows that as a vector space over $\mathbb{F}_p$, $R$ has dimension $n$. Now consider the map

$$T : R \to R$$

defined by $T(g + \langle f \rangle) = g^p + \langle f \rangle$. This is well defined, for if $g + \langle f \rangle = h + \langle f \rangle$, then
$g = h + fB$ for some $B \in \mathbb{F}_p[x]$. Since we are in characteristic $p$, we have

$$g^p = (h + fB)^p = h^p + f^p B^p = h^p + f \cdot f^{p-1} B^p,$$

which implies $g^p + \langle f \rangle = h^p + \langle f \rangle$. Furthermore, it is easy to see that $T$ is linear over
$\mathbb{F}_p$ (you will prove this in Exercise 7). The identity map $1_R : R \to R$ is also linear.
Then we get the following unexpected result.


Theorem 11.2.9 Let $f \in \mathbb{F}_p[x]$ be separable of degree $n > 1$, and let $R = \mathbb{F}_p[x]/\langle f \rangle$.
Then $f$ is irreducible if and only if the linear map $T - 1_R : R \to R$ has rank $n - 1$.


Proof: If $f$ is irreducible, then $R$ is a field and $T$ is the Frobenius automorphism.
Hence the kernel of $T - 1_R$ consists of the solutions of $\alpha^p = \alpha$. This gives $\mathbb{F}_p$, so that
the kernel has dimension 1. By the dimension formula from linear algebra, it follows
that $T - 1_R$ has rank $n - 1$.

On the other hand, if $f$ is reducible, then $f = gh$ where $g, h \in \mathbb{F}_p[x]$ have degree
$< \deg(f)$. Furthermore, $g$ and $h$ must be relatively prime, since $f$ is separable. Hence
we can find $A, B \in \mathbb{F}_p[x]$ such that $Ag + Bh = 1$.

Observe that $Ag + \langle f \rangle$ is in the kernel of $T - 1_R$ if and only if $(Ag)^p - Ag$ is a
multiple of $f$. Using the binomial theorem and $f = gh$, we have

$$\begin{aligned}
(Ag)^p &= Ag(1 - Bh)^{p-1} = Ag\bigl(1 - (p-1)Bh + \cdots + (-1)^{p-1}(Bh)^{p-1}\bigr)\\
&= Ag - gh \cdot (p-1)AB + \cdots + gh \cdot (-1)^{p-1} A B^{p-1} h^{p-2}\\
&\equiv Ag \bmod f.
\end{aligned}$$

It follows that $Ag + \langle f \rangle$ is in the kernel. Interchanging the roles of $Ag$ and $Bh$, we see
that $Bh + \langle f \rangle$ is also in the kernel.

If we can show that $Ag + \langle f \rangle$ and $Bh + \langle f \rangle$ are linearly independent elements of
$R$, then $T - 1_R$ will have rank at most $n - 2$ and the theorem will follow. So suppose
that some linear combination of these cosets is zero, i.e., there are $a, b \in \mathbb{F}_p$ such that
$aAg + bBh$ is a multiple of $f = gh$. Then

$$aAg + bBh = ghC$$

for some $C \in \mathbb{F}_p[x]$. Since $g$ and $h$ are relatively prime, it follows easily that $g \mid bB$ and
$h \mid aA$. But $Ag + Bh = 1$ implies that $\gcd(g, B) = \gcd(h, A) = 1$, so that we must have
$a = b = 0$. This proves the desired linear independence. ■


One useful observation is that from the vector $Ag + \langle f \rangle$ in the kernel of $T - 1_R$,
we can recover the factorization of $f$, since $g = \gcd(f, Ag)$ follows from $f = gh$ and
$Ag + Bh = 1$. In general, elements of the kernel can be more complicated, but it is still
possible to use them to find the irreducible factorization of $f$. This is Berlekamp’s
algorithm, which is described in [13, Sec. 4.1].

Here is an example to show how Theorem 11.2.9 can be used. 


Example 11.2.10 Let $f = x^5 + x^4 + 1 \in \mathbb{F}_2[x]$. Note that $f$ is separable, since
$\gcd(f, f') = 1$. Then $R = \mathbb{F}_2[x]/\langle f \rangle$ is a vector space over $\mathbb{F}_2$ of dimension 5 with
basis $1 + \langle f \rangle, x + \langle f \rangle, x^2 + \langle f \rangle, x^3 + \langle f \rangle, x^4 + \langle f \rangle$, which for simplicity we write as
$1, x, x^2, x^3, x^4$.

Note that $T : R \to R$ is the squaring map, since $p = 2$. To compute the matrix of $T$,
we apply $T$ to each basis element and represent the result in terms of the basis:

$$\begin{aligned}
1 &\mapsto 1,\\
x &\mapsto x^2,\\
x^2 &\mapsto x^4,\\
x^3 &\mapsto x^6 = 1 + x + x^4,\\
x^4 &\mapsto x^8 = 1 + x + x^2 + x^3 + x^4.
\end{aligned}$$

Here, $x^6 = 1 + x + x^4$ means that $1 + x + x^4$ is the remainder of $x^6$ on division by
$f = x^5 + x^4 + 1$, and similarly for the last line.

It follows that the matrix of $T - 1_R$ with respect to the basis $1, x, x^2, x^3, x^4$ is

$$\begin{pmatrix}1&0&0&1&1\\0&0&0&1&1\\0&1&0&0&1\\0&0&0&0&1\\0&0&1&1&1\end{pmatrix} - \begin{pmatrix}1&0&0&0&0\\0&1&0&0&0\\0&0&1&0&0\\0&0&0&1&0\\0&0&0&0&1\end{pmatrix} = \begin{pmatrix}0&0&0&1&1\\0&1&0&1&1\\0&1&1&0&1\\0&0&0&1&1\\0&0&1&1&0\end{pmatrix}$$

(remember that we are in characteristic 2). One easily sees that this matrix has rank
at most 3, since the first column is zero and the sum of the last three columns is zero.
Since $3 < 4 = \deg(f) - 1$, Theorem 11.2.9 implies that $f$ is reducible. ◁▷
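
The rank criterion of Theorem 11.2.9 translates directly into a short program. The following Python sketch is illustrative only and is not Berlekamp's full factoring algorithm; the function names `frobenius_minus_identity` and `rank_mod_p` are assumptions of this example. It builds the matrix of $T - 1_R$ in the basis $1, x, \dots, x^{n-1}$ for a monic $f$ given as a coefficient list (constant term first) and computes its rank over $\mathbb{F}_p$ by Gaussian elimination. The input is assumed to be separable, as the theorem requires.

```python
def frobenius_minus_identity(f, p):
    """Matrix of T - 1_R on R = F_p[x]/<f> in the basis 1, x, ..., x^(n-1).

    f is monic, given as a coefficient list with the constant term first.
    Column j holds the coordinates of x^(j*p) - x^j.
    """
    n = len(f) - 1

    def times_x(a):
        # Multiply the residue a by x, then subtract c*f to remove x^n.
        shifted = [0] + a
        c = shifted[n]
        return [(shifted[j] - c * f[j]) % p for j in range(n)]

    cur = [1] + [0] * (n - 1)           # the residue of x^0
    columns = []
    for j in range(n):
        columns.append(cur)             # cur is x^(j*p) modulo f
        for _ in range(p):              # advance to x^((j+1)*p)
            cur = times_x(cur)
    return [[(columns[j][i] - int(i == j)) % p for j in range(n)]
            for i in range(n)]


def rank_mod_p(matrix, p):
    """Rank of a matrix over F_p (p prime) by Gaussian elimination."""
    rows = [row[:] for row in matrix]
    rank = 0
    ncols = len(rows[0]) if rows else 0
    for col in range(ncols):
        pivot = next((r for r in range(rank, len(rows)) if rows[r][col]), None)
        if pivot is None:
            continue
        rows[rank], rows[pivot] = rows[pivot], rows[rank]
        inv = pow(rows[rank][col], p - 2, p)        # inverse by Fermat
        rows[rank] = [(v * inv) % p for v in rows[rank]]
        for r in range(len(rows)):
            if r != rank and rows[r][col]:
                factor = rows[r][col]
                rows[r] = [(v - factor * w) % p
                           for v, w in zip(rows[r], rows[rank])]
        rank += 1
    return rank


f = [1, 0, 0, 0, 1, 1]                  # 1 + x^4 + x^5 over F_2
M = frobenius_minus_identity(f, 2)
print(rank_mod_p(M, 2), len(f) - 2)     # the rank, then n - 1
```

For the polynomial of Example 11.2.10 the rank is smaller than $n - 1$, so the test reports that $f$ is reducible.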


Historical Notes 


In the early 1800s Gauss showed that the number of monic irreducible polynomials
in $\mathbb{F}_p[x]$ of degree $n$ is given by the formula

$$N_n = \frac{1}{n}\Bigl(p^n - \sum_{a} p^{n/a} + \sum_{ab} p^{n/ab} - \sum_{abc} p^{n/abc} + \cdots\Bigr), \tag{11.10}$$

where $\sum_a$ is the sum over all distinct primes dividing $n$, $\sum_{ab}$ is the sum over all
products of two distinct primes dividing $n$, and so on. In Exercise 8 you will show
that (11.10) is equivalent to the formula given in Theorem 11.2.4. Schönemann
discovered (11.10) independently in 1846. He and Gauss also knew the factorization
of $x^{p^n} - x$ given in (11.7).

It is also possible to count the number of monic irreducible polynomials in $\mathbb{F}_q[x]$
of degree $n$. In Exercise 9 you will prove the analogs of Theorems 11.2.2 and 11.2.4
for arbitrary finite fields.

The Berlekamp algorithm is much more recent. The relation between irreducibility
and the rank of $T - 1_R$ given in Theorem 11.2.9 is due to Butler [4] in 1954. He proved
more generally that if $f \in \mathbb{F}_p[x]$ is separable of degree $n > 0$, then the rank of $T - 1_R$
is $n - k$, where $k$ is the number of irreducible factors of $f$ in $\mathbb{F}_p[x]$ (see Exercises 10
and 11). This theorem can be generalized to any finite field. In 1967 Berlekamp
[2] rediscovered Butler’s result and used it as the basis for his factoring algorithm.
Some of the beginning steps of his method will be discussed in Exercise 12, and the
details can be found in [13, Sec. 4.1]. Berlekamp’s algorithm works best for small
finite fields; other methods are used for larger ones.

We refer the reader to Chapters 3 and 4 of [13] for much more on the mathematics 
and history of polynomials over finite fields. See also [5]. 


Exercises for Section 11.2 


Exercise 1. Let f € F [x] be irreducible, where F is a finite field. Prove that f is separable. 


Exercise 2. This exercise concerns Theorem 11.2.2 and the factorization (11.7).
(a) Compute $N_3$ and $N_4$ using only Theorem 11.2.2.
(b) Write down the factorization (11.7) explicitly when $p^n = 4$ and 8.


Exercise 3. Use Theorem 11.2.4 to compute $N_6$ and $N_{36}$.


Exercise 4. In Theorem 11.1.4 we used splitting fields to show that a field of order $p^n$ exists
for any prime $p$ and integer $n \ge 1$. When Galois and others considered this question in the
nineteenth century, their approach was to prove the existence of an irreducible polynomial in
$\mathbb{F}_p[x]$ of degree $n$. In other words, they needed to prove that $N_n > 0$.
(a) Prove that $N_n > 0$ using Theorem 11.1.4.
(b) Suppose that we have proved Theorem 11.2.4 but not Theorem 11.1.4. Use this to prove
that $N_n > 0$.


Exercise 5. Let $F$ be a field of characteristic $p$, and let $\alpha \in F$ be a root of unity. Prove that
there is some $d \ge 1$ relatively prime to $p$ such that $\alpha$ is a $d$th root of unity.


Exercise 6. This exercise is concerned with Example 11.2.8. 
(a) Show that $N_4 = 3$ when $p = 2$. Then write down these three irreducible polynomials
explicitly.
(b) Verify the factorization of $\Phi_{15}(x)$ given in the example.
(c) Show that the roots of $x^4 + x^3 + 1$ and $x^4 + x + 1$ are the reciprocals of each other.


Exercise 7. As in the discussion of Berlekamp’s algorithm, let $R = \mathbb{F}_p[x]/\langle f \rangle$ and consider
the $p$th-power map $T : R \to R$. Prove that $T$ is a linear map when $R$ is regarded as a vector
space over $\mathbb{F}_p$.


Exercise 8. Prove that Gauss’s formula (11.10) is equivalent to the formula given in
Theorem 11.2.4.


Exercise 9. State and prove analogs of Theorems 11.2.2 and 11.2.4 that count monic irreducible
polynomials of degree $n$ in $\mathbb{F}_q[x]$, where $q$ is now a power of the prime $p$.


Exercise 10. Suppose that a monic polynomial $f \in \mathbb{F}_p[x]$ has a factorization $f = f_1 \cdots f_k$, where
$f_1, \dots, f_k$ are distinct monic irreducible polynomials. Let $R = \mathbb{F}_p[x]/\langle f \rangle$, and let $R_i = \mathbb{F}_p[x]/\langle f_i \rangle$
for $i = 1, \dots, k$. Then consider the map

$$\varphi : R \to R_1 \times \cdots \times R_k$$

defined by

$$\varphi(g + \langle f \rangle) = (g + \langle f_1 \rangle, \dots, g + \langle f_k \rangle).$$

The goal of this exercise is to prove that $\varphi$ is a ring isomorphism when we make $R_1 \times \cdots \times R_k$
into a ring using coordinatewise addition and multiplication.
(a) Prove that $\varphi$ is a well-defined ring homomorphism.
(b) Prove that $\varphi$ is one-to-one.
(c) Show that $R$ and $R_1 \times \cdots \times R_k$ have the same dimension when considered as vector spaces
over $\mathbb{F}_p$.
(d) Use the dimension theorem from linear algebra to conclude that $\varphi$ is a ring isomorphism.


Exercise 11. In the situation of Theorem 11.2.9, let $T : R \to R$ be the $p$th-power map, where
$R = \mathbb{F}_p[x]/\langle f \rangle$ and $f$ is separable of degree $n$. The goal of this exercise is to prove that the
rank of $T - 1_R$ is $n - k$, where $k$ is the number of irreducible factors of $f$ in $\mathbb{F}_p[x]$. We will use
the isomorphism $\varphi : R \simeq R' = R_1 \times \cdots \times R_k$ constructed in Exercise 10.
(a) Let $T' : R' \to R'$ be the map that is the $p$th power on each coordinate. Prove that $\varphi$
induces an isomorphism between the kernel of $T - 1_R$ and the kernel of $T' - 1_{R'}$.
(b) Prove that the kernel of $T' - 1_{R'}$ has dimension $k$ as a vector space over $\mathbb{F}_p$.
(c) Prove that $T - 1_R$ has rank $n - k$, and use this to give another proof of Theorem 11.2.9.


Exercise 12. Let $f \in \mathbb{F}_p[x]$ be monic and separable of degree $n > 1$, and assume that $T - 1_R$
has rank $\ne n - 1$. By Theorem 11.2.9, $f$ is reducible. In this exercise, you will use the kernel
of $T - 1_R$ to produce a nontrivial factorization of $f$.
(a) Show that the constant polynomials in $\mathbb{F}_p[x]$ give a one-dimensional subspace of the kernel
of $T - 1_R$.
(b) Prove that there is a nonconstant polynomial $h \in \mathbb{F}_p[x]$ of degree $< n$ such that $f \mid h^p - h$.
Parts (c), (d), and (e) will use $h$ to produce a nontrivial factorization of $f$.
(c) Explain why $h^p - h = \prod_{a \in \mathbb{F}_p} (h - a)$ in $\mathbb{F}_p[x]$.
(d) Use parts (b) and (c) to show that $f = \prod_{a \in \mathbb{F}_p} \gcd(f, h - a)$.
(e) Use $\deg(h) < n$ to show that $f \nmid \gcd(f, h - a)$ when $a \in \mathbb{F}_p$. Conclude that the factorization
of part (d) is nontrivial, i.e., $\gcd(f, h - a)$ is a nonconstant factor of $f$ of degree $< n$ for
at least two $a \in \mathbb{F}_p$.

The basic idea of Berlekamp’s algorithm is that one can factor $f$ into irreducibles by taking
the gcd’s of the nontrivial factors $\gcd(f, h - a)$ produced by part (e) as we vary $h$ and $a$.


Exercise 13. Consider the polynomial $f = x^6 + x^4 + x + 1 \in \mathbb{F}_2[x]$.
(a) Use Exercise 11 and the method of Example 11.2.10 to show that $f$ is the product of three
irreducible polynomials in $\mathbb{F}_2[x]$. Also find a basis of the kernel of $T - 1_R$.
(b) One element of the kernel is $(0,0,1,1,0,1)$. This corresponds to $h = x^2 + x^3 + x^5$, since
we’re using the basis of $R$ given by the cosets of $1, x, \dots, x^5$. Show that $\gcd(f, h)$ and
$\gcd(f, h + 1)$ give a nontrivial factorization $f = g_1 g_2$ as in Exercise 12.
(c) Pick an element $h'$ of the kernel not in the span of 1 and $h$. Compute $\gcd(g_i, h')$ and
$\gcd(g_i, h' + 1)$ for $i = 1, 2$.
(d) Part (c) should show that $f$ is a product of three nonconstant polynomials. Why is this
the irreducible factorization of $f$?


Exercise 14. In this exercise we will count the number of primitive elements of the extension
$\mathbb{F}_p \subset \mathbb{F}_{p^n}$. This is the number

$$P_n = |\{\alpha \in \mathbb{F}_{p^n} \mid \mathbb{F}_{p^n} = \mathbb{F}_p(\alpha)\}|.$$

(a) Use Corollary 11.1.8 to prove that $p^n = \sum_{m \mid n} P_m$.
(b) Use the Möbius inversion formula to conclude that $P_n = \sum_{m \mid n} \mu(m)\, p^{n/m}$. This formula
was first proved by Dedekind in 1857.
(c) Explain how the formula of part (b) relates to Theorem 11.2.4.


Exercise 15. This exercise will illustrate how the word “primitive” is sometimes overused
in mathematics. In the previous problem, we computed the number of primitive elements of
$\mathbb{F}_p \subset \mathbb{F}_{p^n}$. In this problem, we consider the primitive roots of $\mathbb{F}_{p^n}$, which are generators of
the cyclic group $\mathbb{F}_{p^n}^*$. The minimal polynomial over $\mathbb{F}_p$ of a primitive root of $\mathbb{F}_{p^n}$ is called a
primitive polynomial for $\mathbb{F}_{p^n}$. These are the minimal polynomials of the primitive $(p^n - 1)$st
roots of unity in characteristic $p$.
(a) Prove that $\mathbb{F}_{p^n}$ has $\phi(p^n - 1)$ primitive roots, where $\phi$ is the Euler $\phi$-function.
(b) Prove that every primitive polynomial for $\mathbb{F}_{p^n}$ has degree $n$.
(c) Prove that the product of the primitive polynomials for $\mathbb{F}_{p^n}$ is $\Phi_{p^n - 1}(x)$.


Exercise 16. Consider the trinomial $f = x^r + x^s + 1 \in \mathbb{F}_2[x]$, where $r > s > 0$ and $r$ is prime.
Prove that $f$ is irreducible over $\mathbb{F}_2$ if and only if $f \mid x^{2^r} - x$. If in addition $r$ is large and $f$ is
primitive in the sense of Exercise 15, then one can use f to make a pseudo-random number 
generator that takes a long time to repeat itself. For example, $x^{43112609} + x^{21078848} + 1$ is a
primitive trinomial of large degree. See [3] for more details.


Exercise 17. In Section 4.2, we used the Schönemann-Eisenstein criterion to prove that
$\Phi_p(x) = x^{p-1} + \cdots + x + 1$ is irreducible over $\mathbb{Q}$, where $p$ is prime. Here is a very different
proof. We know that primitive roots modulo $p$ exist. By Dirichlet’s theorem on primes in
arithmetic progressions, it follows that there is a prime $\ell$ such that $[\ell] \in (\mathbb{Z}/p\mathbb{Z})^*$ has order
$p - 1$. Prove that $\Phi_p(x)$ is irreducible modulo $\ell$ and conclude that it is irreducible over $\mathbb{Q}$.
This argument is due to Schönemann in 1845 (see [5]).


PART IV


FURTHER TOPICS 


The final four chapters of the book sample the further riches of Galois theory. 

Chapter 12 explores the history of Galois theory. We begin with Lagrange, who 
studied an important special case. We then explain how Galois thought about his 
theory and discuss Kronecker’s approach to this subject. 

Given an arbitrary polynomial, how do we find its Galois group? In Chapter 13 
we show that in principle this can always be done. We also explore various more 
efficient methods for dealing with polynomials of small degree. 

Chapter 14 continues our study of solvability by radicals. We give Galois’s 
wonderful criterion for when an irreducible polynomial of prime degree is solvable 
by radicals. Then, following Galois, we consider irreducible polynomials of prime- 
squared degree. This requires a careful study of the theory of permutation groups. 

Finally, Chapter 15 discusses Abel’s theorem on straightedge-and-compass con- 
structions on the lemniscate. This involves some truly wonderful mathematics. In 
particular, we use certain elliptic functions constructed from the lemniscate to create 
extensions of Q(i) with Abelian Galois groups. 


CHAPTER 12 


LAGRANGE, GALOIS, AND 
KRONECKER 


This chapter will explore the contributions to Galois theory made by Lagrange, 
Galois, and Kronecker. Our account of these great mathematicians will touch on the 
high points of their work on the roots of polynomials. 

As you read this chapter, you will be asked to look back at numerous arguments 
in previous chapters. The goal is to gain a better understanding of where these 
arguments came from and to see how the basic concepts of Galois theory evolved. 


12.1 LAGRANGE 


As we noted in the Historical Notes to Section 1.2, Lagrange’s 1770 treatise Réflexions
sur la résolution algébrique des équations studied the known methods for solving
equations of degree $\le 4$ and analyzed these methods using permutations. Lagrange’s
hope was that these methods could be adapted to equations of degree $\ge 5$. This
section will discuss some of Lagrange’s ideas and explain why his approach was
doomed to failure for degree $\ge 5$.

Lagrange’s goal in 1770 was to understand the roots of an arbitrary polynomial.
However, when dealing with expressions in the roots, Lagrange was concerned
“only with the form” of such expressions and not “with their numerical quantity”
[Lagrange, p. 385]. In modern terms, this means the following. Given a field $F$
and roots $\alpha_1, \dots, \alpha_n$ of a polynomial $f \in F[x]$ of degree $n$, then for Lagrange, an
“expression” in the roots is a quotient of the form

$$\frac{A(\alpha_1, \dots, \alpha_n)}{B(\alpha_1, \dots, \alpha_n)},$$

where $A, B \in F[x_1, \dots, x_n]$ are polynomials in variables $x_1, \dots, x_n$ with coefficients in
$F$. Hence the “form” of this expression is the rational function

$$\frac{A(x_1, \dots, x_n)}{B(x_1, \dots, x_n)} \in F(x_1, \dots, x_n).$$


By using only the “form,” Lagrange is dealing with the case when the roots are
variables $x_1, \dots, x_n$. These are the roots of the universal polynomial of degree $n$,

$$\tilde{f} = (x - x_1) \cdots (x - x_n) = x^n - \sigma_1 x^{n-1} + \cdots + (-1)^n \sigma_n,$$

where $\sigma_1, \dots, \sigma_n$ are the elementary symmetric polynomials from Section 2.1.
The coefficients of $\tilde{f}$ lie in $K = F(\sigma_1, \dots, \sigma_n)$, and the splitting field of $\tilde{f}$ over $K$
is $L = F(x_1, \dots, x_n)$. We will use the universal extension in degree $n$,

$$K \subset L,$$


throughout this section. We will also assume that $F$ has characteristic 0 so that we
can use the results on solvability by radicals proved in Chapter 8. Theorem 6.4.1
shows that $K \subset L$ is a Galois extension with Galois group

$$\mathrm{Gal}(L/K) \simeq S_n, \tag{12.1}$$

where $\sigma \in S_n$ gives the automorphism that sends a rational function $\varphi \in L$ to the
rational function $\sigma \cdot \varphi$ obtained from $\varphi$ by replacing $x_i$ with $x_{\sigma(i)}$ for each $i$. We are
thus in a rich mathematical context: We know explicitly how the Galois group acts,
we have the Galois correspondence, and we understand solvability by radicals.

Lagrange, on the other hand, was working 60 years before Galois and 150 years 
before Artin formalized the Galois correspondence. Lagrange’s main tools were the 
following results from the theory of symmetric functions: 


• A polynomial in $F[x_1, \dots, x_n]$ that is unchanged by all permutations in $S_n$ lies in
$F[\sigma_1, \dots, \sigma_n]$.
• A rational function in $L = F(x_1, \dots, x_n)$ that is unchanged by all permutations in
$S_n$ lies in $K = F(\sigma_1, \dots, \sigma_n)$ (i.e., $K$ is the fixed field of (12.1) acting on $L$).

We proved the first bullet in Theorem 2.2.2 and the second in Exercises 7 and 8 from
Section 2.2. Using these facts, Lagrange discovered several important parts of Galois
theory, even though the concept of “group” didn’t exist in 1770. Lagrange didn’t
even have Cauchy’s efficient notation for expressing permutations.

We will now explore some of what Lagrange did in his Réflexions.


A. Resolvent Polynomials. Fix a rational function $\varphi \in L = F(x_1, \dots, x_n)$, and
consider the rational functions $\sigma \cdot \varphi$ for all $\sigma \in S_n$. Let

$$\varphi = \varphi_1, \varphi_2, \dots, \varphi_r \tag{12.2}$$

be the distinct rational functions we get in this way. The polynomial

$$\theta(x) = \prod_{i=1}^{r} (x - \varphi_i) \tag{12.3}$$

with roots $\varphi_1, \dots, \varphi_r$ is a special case of the polynomial (7.1) used in the proof of
Theorem 7.1.1. Reread the proof of Theorem 7.1.1, especially the part where we
show that (7.1) is the minimal polynomial of an element in a Galois extension. Be
sure you understand how (7.1), applied to $\varphi \in L$, gives the polynomial $\theta$ defined
above. It follows that $\theta$ has coefficients in $K = F(\sigma_1, \dots, \sigma_n)$, is separable, and is the
minimal polynomial of $\varphi$ over $K$. Hence we have proved the following.


Proposition 12.1.1 The polynomial $\theta$ from (12.3) lies in $K[x]$ and is separable and
irreducible. ■


We call $\theta$ the resolvent polynomial of $\varphi$. Hence the “Galois” construction of
minimal polynomials given in (7.1) is actually due to Lagrange. In Exercise 1 you
will follow Lagrange by proving that $\theta \in K[x]$ using the second of the above bullets.

Here is an example of a resolvent polynomial from Chapter 1.


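Example 12.1.2 Let $n = 3$ and $z_1 = \frac{1}{3}(x_1 + \omega^2 x_2 + \omega x_3)$, $\omega = e^{2\pi i/3}$. You will check
in Exercise 2 that $S_3$ acting on $z_1$ gives the six elements $z_1$, $z_2 = (23) \cdot z_1$, $\omega z_1$, $\omega z_2$,
$\omega^2 z_1$, $\omega^2 z_2$ listed in (1.10) and that the resulting resolvent is

$$\begin{aligned}
\theta(z) &= (z - z_1)(z - z_2)(z - \omega z_1)(z - \omega z_2)(z - \omega^2 z_1)(z - \omega^2 z_2)\\
&= z^6 + q z^3 - p^3/27,
\end{aligned}$$

where

$$p = -\frac{\sigma_1^2}{3} + \sigma_2, \qquad q = -\frac{2\sigma_1^3}{27} + \frac{\sigma_1 \sigma_2}{3} - \sigma_3.$$

These formulas become identical to those derived in (1.2) and (1.4) of Section 1.1 if
we use $x^3 + bx^2 + cx + d$ in place of $x^3 - \sigma_1 x^2 + \sigma_2 x - \sigma_3$. ◁▷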


One of Lagrange’s main goals was to replace the clever substitutions used to derive
Cardan’s formulas in Section 1.1 with the following systematic process:

• Pick $z_1 \in L = F(x_1, x_2, x_3)$ whose resolvent $\theta(z)$ is easy to solve.
• Express the roots $x_1, x_2, x_3$ in terms of the roots of the resolvent.

Example 12.1.2 does the first step of this process, since $\theta(z) = z^6 + q z^3 - p^3/27 = 0$
can be solved by the quadratic formula. The second step is equally easy, since

$$\begin{aligned}
z_1 &= \tfrac{1}{3}(x_1 + \omega^2 x_2 + \omega x_3),\\
z_2 &= \tfrac{1}{3}(x_1 + \omega x_2 + \omega^2 x_3),
\end{aligned}$$

together with $1 + \omega + \omega^2 = 0$ and $\sigma_1 = x_1 + x_2 + x_3$, imply that

$$\begin{aligned}
x_1 &= \tfrac{1}{3}\sigma_1 + z_1 + z_2,\\
x_2 &= \tfrac{1}{3}\sigma_1 + \omega z_1 + \omega^2 z_2,\\
x_3 &= \tfrac{1}{3}\sigma_1 + \omega^2 z_1 + \omega z_2.
\end{aligned}$$

Comparing this with (8.15) from Section 8.3 reveals that our “Galois” approach to
Cardan’s formulas is virtually identical to the “Lagrange” approach just described.
The major difference is that in (8.15), we used the Lagrange resolvent

$$\alpha_1 = x_1 + \omega^{-1} x_2 + \omega^{-2} x_3 = x_1 + \omega^2 x_2 + \omega x_3$$

rather than $z_1 = \frac{1}{3}(x_1 + \omega^2 x_2 + \omega x_3)$ as above. The name “Lagrange resolvent” is no
accident, as we will see later in the section.
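
The two-step process can be carried out numerically. The Python sketch below is an illustration under stated assumptions and is not taken from the text: the function name `solve_cubic_lagrange` is hypothetical, the computation uses floating-point complex arithmetic, and the relation $z_1 z_2 = -p/3$ (which follows from the definitions of $z_1$, $z_2$, and $p$) is used to pick the cube root $z_2$ that matches $z_1$. Given $\sigma_1, \sigma_2, \sigma_3$, it solves the resolvent as a quadratic in $z^3$ and then recovers $x_1, x_2, x_3$ from the formulas above.

```python
import cmath


def solve_cubic_lagrange(s1, s2, s3):
    """Roots of x^3 - s1*x^2 + s2*x - s3 via the resolvent z^6 + q*z^3 - p^3/27."""
    p = -s1 ** 2 / 3 + s2
    q = -2 * s1 ** 3 / 27 + s1 * s2 / 3 - s3
    # Step 1: the resolvent is a quadratic in w = z^3.
    disc = cmath.sqrt(q * q + 4 * p ** 3 / 27)
    w1, w2 = (-q + disc) / 2, (-q - disc) / 2
    w = w1 if abs(w1) >= abs(w2) else w2    # avoid a zero cube root when possible
    z1 = w ** (1 / 3)                       # principal complex cube root
    z2 = -p / (3 * z1) if z1 != 0 else 0    # enforce z1*z2 = -p/3
    # Step 2: express the roots in terms of z1 and z2.
    omega = cmath.exp(2j * cmath.pi / 3)
    return [
        s1 / 3 + z1 + z2,
        s1 / 3 + omega * z1 + omega ** 2 * z2,
        s1 / 3 + omega ** 2 * z1 + omega * z2,
    ]


# (x - 1)(x - 2)(x - 3) has sigma_1 = 6, sigma_2 = 11, sigma_3 = 6.
for root in solve_cubic_lagrange(6, 11, 6):
    print(root)
```

Up to rounding error, the three printed values are the roots 1, 2, 3 in some order.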
Here is another example of a resolvent polynomial. 


Example 12.1.3 Consider $y_1 = x_1 x_2 + x_3 x_4 \in L = F(x_1, x_2, x_3, x_4)$. In Exercise 3 you
will show that the action of $S_4$ on $y_1$ gives the three polynomials

$$y_1 = x_1 x_2 + x_3 x_4, \quad y_2 = x_1 x_3 + x_2 x_4, \quad y_3 = x_1 x_4 + x_2 x_3,$$

and that the corresponding resolvent, as a polynomial in $y$, is

$$\begin{aligned}
\theta(y) &= (y - (x_1 x_2 + x_3 x_4))(y - (x_1 x_3 + x_2 x_4))(y - (x_1 x_4 + x_2 x_3))\\
&= y^3 - \sigma_2 y^2 + (\sigma_1 \sigma_3 - 4\sigma_4) y - \sigma_3^2 - \sigma_1^2 \sigma_4 + 4 \sigma_2 \sigma_4.
\end{aligned}$$

This will be useful later in the section when we solve the quartic equation. ◁▷
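
The coefficients of this resolvent can be spot-checked by substituting integers for $x_1, \dots, x_4$. The following Python sketch is only a numerical check at sample points, not a proof of the identity; the function names are assumptions of this example. It computes the coefficients of $\theta(y)$ once from $y_1, y_2, y_3$ directly and once from $\sigma_1, \dots, \sigma_4$.

```python
from itertools import combinations
from math import prod


def elementary_symmetric(values, k):
    """k-th elementary symmetric polynomial evaluated at the given values."""
    return sum(prod(c) for c in combinations(values, k))


def resolvent_from_roots(x):
    """Coefficients of y^2, y, 1 in (y - y1)(y - y2)(y - y3)."""
    x1, x2, x3, x4 = x
    y1, y2, y3 = x1 * x2 + x3 * x4, x1 * x3 + x2 * x4, x1 * x4 + x2 * x3
    return (-(y1 + y2 + y3), y1 * y2 + y1 * y3 + y2 * y3, -y1 * y2 * y3)


def resolvent_from_sigmas(x):
    """The same coefficients, written in terms of sigma_1, ..., sigma_4."""
    s1, s2, s3, s4 = (elementary_symmetric(x, k) for k in range(1, 5))
    return (-s2, s1 * s3 - 4 * s4, -s3 ** 2 - s1 ** 2 * s4 + 4 * s2 * s4)


sample = (1, 2, 3, 4)
print(resolvent_from_roots(sample), resolvent_from_sigmas(sample))
```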


In Example 12.1.3, note that when the 24 permutations $\sigma \in S_4$ are substituted
into $\sigma \cdot y_1$, the result is always one of the three polynomials $y_1, y_2, y_3$. Lagrange
made the crucial observation that this happens because many permutations leave
$y_1 = x_1 x_2 + x_3 x_4$ unchanged. From the modern point of view this is best stated using
the language of group actions. Group actions are discussed in Section A.4 and have
been used in several places in the text, most prominently in the proof of Theorem 6.4.1
in Section 6.4.

In terms of group actions, we can describe what Lagrange did as follows. In the
proof of Theorem 6.4.1, we showed that $S_n$ acts on $L = F(x_1, \dots, x_n)$. For our chosen
rational function $\varphi \in L$, we have the orbit

$$S_n \cdot \varphi = \{\sigma \cdot \varphi \mid \sigma \in S_n\} = \{\varphi_1 = \varphi, \varphi_2, \dots, \varphi_r\},$$

where the last equality uses the notation of (12.2). We also have the isotropy subgroup

$$H(\varphi) = \{\sigma \in S_n \mid \sigma \cdot \varphi = \varphi\}.$$

Since every $\sigma \in H(\varphi)$ satisfies $\sigma \cdot \varphi = \varphi$, we can write this symbolically as

$$H(\varphi) \cdot \varphi = \varphi.$$

Now consider $\varphi_i$ in the orbit of $\varphi$. This implies that $\varphi_i = \sigma_i \cdot \varphi$ for some $\sigma_i \in S_n$.
One easily sees that $\sigma \cdot \varphi = \varphi_i$ for every $\sigma \in \sigma_i H(\varphi)$ (be sure you can show this). As
above, we write this symbolically as

$$\sigma_i H(\varphi) \cdot \varphi = \varphi_i. \tag{12.4}$$

In this way, we partition $S_n$ into $r$ cosets of $H(\varphi)$. Since each coset has $|H(\varphi)|$
elements, we conclude that

$$|S_n| = r\,|H(\varphi)|.$$


Although he didn’t use this terminology, the above partition is implicit in what
Lagrange did. Hence Lagrange in essence proved the following:

• $|H(\varphi)|$ divides the order of $S_n$. Thus the index

$$[S_n : H(\varphi)] = \frac{|S_n|}{|H(\varphi)|} = \frac{n!}{|H(\varphi)|}$$

is an integer. This is a special case of Lagrange’s Theorem.
• $r = |S_n|/|H(\varphi)| = [S_n : H(\varphi)]$. Thus the number of elements in the orbit of $\varphi$ is the
index of the isotropy subgroup $H(\varphi)$. This is a special case of the Fundamental
Theorem of Group Actions from Appendix A.4.


These results are more than just special cases: They represent the first time these issues 
were considered in mathematics. The name “Lagrange’s Theorem” was chosen in 
honor of Lagrange’s analysis of this situation. 

On the other hand, the details of Lagrange’s arguments are quite different from
ours. To see how he approached these matters, we need to think in terms of resolvent
polynomials. For this purpose, consider the polynomial of degree $n!$ defined by

$$\Theta(x) = \prod_{\sigma \in S_n} (x - \sigma \cdot \varphi).$$

To compare this with the resolvent

$$\theta(x) = \prod_{i=1}^{r} (x - \varphi_i),$$

we organize the product formula for $\Theta$ according to cosets of $H(\varphi)$ in $S_n$. The key
observation is that (12.4) and $|\sigma_i H(\varphi)| = |H(\varphi)|$ imply that

$$\prod_{\sigma \in \sigma_i H(\varphi)} (x - \sigma \cdot \varphi) = (x - \varphi_i)^{|H(\varphi)|}.$$

Since this holds for $i = 1, \dots, r$, we obtain the following theorem of Lagrange.


Theorem 12.1.4 Given $\varphi \in L = F(x_1, \dots, x_n)$, the polynomials $\Theta$ and $\theta$ are related
by the equation

$$\Theta(x) = \theta(x)^{|H(\varphi)|}.$$

In particular, the degree of the resolvent polynomial $\theta$ is the index

$$[S_n : H(\varphi)] = \frac{|S_n|}{|H(\varphi)|} = \frac{n!}{|H(\varphi)|}.$$


Here is an example to show how this result can be used.


Example 12.1.5 Example 12.1.3 shows that the resolvent of $y_1 = x_1x_2 + x_3x_4$ has
degree 3. Thus, by Theorem 12.1.4, the isotropy group $H(y_1)$ has $4!/3 = 8$ elements.
Also note that $H(y_1)$ contains $(12)$ and $(1324)$. In Exercise 3 you will show that
$$H(y_1) = \langle (12), (1324)\rangle \subset S_4$$
and that $H(y_1)$ is isomorphic to the dihedral group $D_8$ of order 8.
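
Example 12.1.5 and Theorem 12.1.4 can be checked numerically. The sketch below is only an illustration under stated assumptions: it stores $y_1$ as a set of index pairs, specializes the variables to the sample values $1, 2, 3, 5$ (chosen so that the three conjugates of $y_1$ take distinct values), and uses hypothetical helper names such as `generate` and `poly_mul`.

```python
from itertools import permutations

def compose(p, q):
    """Permutation 'apply q first, then p' on 0-based tuples."""
    return tuple(p[i] for i in q)

def generate(gens):
    """Subgroup generated by gens, found by closing under multiplication."""
    identity = tuple(range(len(gens[0])))
    group, frontier = {identity}, [identity]
    while frontier:
        g = frontier.pop()
        for s in gens:
            h = compose(g, s)
            if h not in group:
                group.add(h)
                frontier.append(h)
    return group

def act(p, pairs):
    """Action on x_a x_b + x_c x_d, stored as the index pairs {a,b}, {c,d}."""
    return frozenset(frozenset(p[i] for i in pair) for pair in pairs)

def value(pairs, x):
    """Evaluate the stored polynomial at the numbers x."""
    total = 0
    for pair in pairs:
        a, b = tuple(pair)
        total += x[a] * x[b]
    return total

def poly_mul(a, b):
    """Multiply polynomials given as coefficient lists, lowest degree first."""
    out = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out

y1 = frozenset({frozenset({0, 1}), frozenset({2, 3})})   # x1 x2 + x3 x4
S4 = list(permutations(range(4)))

H = {s for s in S4 if act(s, y1) == y1}          # isotropy subgroup H(y1)
G = generate([(1, 0, 2, 3), (2, 3, 1, 0)])       # <(12), (1324)>, 0-based

x = (1, 2, 3, 5)                                 # assumed sample values
values = [value(act(s, y1), x) for s in S4]

Theta = [1]                                      # product over all of S_4
for v in values:
    Theta = poly_mul(Theta, [-v, 1])
theta = [1]                                      # product over the orbit
for v in sorted(set(values)):
    theta = poly_mul(theta, [-v, 1])
theta_power = [1]
for _ in range(len(H)):
    theta_power = poly_mul(theta_power, theta)

print(G == H, len(H), Theta == theta_power)
```

The last comparison is the specialization of $\Theta(x) = \theta(x)^{|H(\varphi)|}$ to these sample values; it illustrates the theorem but does not prove it.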


It is fun to read Lagrange’s statement of Theorem 12.1.4: 


One can show in the same manner that, if the function
$$f[(x')(x'')(x''')(x^{IV})\dots]$$
is by its own nature such that it conserves the same value when two, or three, or
a greater number of different permutations are made among the roots $x', x'', x''',
x^{IV}, \dots$, the roots of the equation $\Theta = 0$ will be equal three by three, or four by
four, or etc.; so that the quantity $\Theta$ will be equal to a cube $\theta^3$, or a square-square
$\theta^4$, or etc., and consequently the equation $\Theta = 0$ will reduce to that of $\theta = 0$,
whose degree will be equal to $\varpi/3$, or to $\varpi/4$, or etc.


(See [Lagrange, pp. 370-371].) Here, $\varpi$ is Lagrange’s notation for $n!$, and his $f$ is
our $\varphi$. In this statement, Lagrange says that if $f$ is fixed by 2, 3, etc. permutations,
then the resolvent has degree $n!/3$, $n!/4$, etc. At first glance, this seems wrong, for
the denominator is one more than the number of permutations. The reason for the
discrepancy is that Lagrange didn’t count the identity permutation.

In most courses on group theory, students usually study cosets and Lagrange’s 
Theorem in one part of the course and group actions in another. Pedagogically, 
this makes sense, but it is also important to remember that historically, things are 
often more complicated. In considering resolvent polynomials, Lagrange had to deal 
with many issues all at once. It is a testament to his power as a mathematician that 
Lagrange could see what was important and thereby enable his successors to sort out 
the details of what he did. 


B. Similar Functions. In [Lagrange, pp. 358-359] Lagrange says that “one calls
functions similar those that vary at the same time or remain the same when one makes
the same permutations among the quantities of which they are composed.” In modern
terms this means that $\varphi, \psi \in L = F(x_1,\dots,x_n)$ are similar if for all $\sigma \in S_n$, we have
$$\sigma\cdot\varphi = \varphi \iff \sigma\cdot\psi = \psi.$$

Thus rational functions $\varphi$ and $\psi$ are similar if and only if they have the same isotropy
subgroup, i.e., $H(\varphi) = H(\psi)$.

Lagrange makes a careful study of similar functions, though his most amazing
result concerns the more general situation where we have $\varphi, \psi \in L$ with the property
that $\psi$ is fixed by every permutation that fixes $\varphi$. In terms of isotropy groups, this
condition can be written
$$H(\varphi) \subset H(\psi)$$
(be sure you understand this). Assuming we know $\varphi$, can we determine which $\psi$’s
satisfy the above condition? Here is Lagrange’s remarkable answer.


Theorem 12.1.6 Suppose that rational functions $\varphi, \psi \in L = F(x_1,\dots,x_n)$ have the
property that $\psi$ is fixed by every permutation fixing $\varphi$. Then $\psi$ is a rational function
in $\varphi$ with coefficients in $K = F(\sigma_1,\dots,\sigma_n)$.


Proof: Our first proof of the theorem will use the Galois correspondence for $K \subset L$.
Using $\varphi \in L$, we get the intermediate field $K \subset K(\varphi) \subset L$. Then:


• Under the group isomorphism (12.1), $\mathrm{Gal}(L/K(\varphi)) \subset \mathrm{Gal}(L/K)$ corresponds to
$H(\varphi) \subset S_n$. Be sure you understand how this follows from Proposition 6.1.4.

• By hypothesis, $\psi$ is fixed by $H(\varphi)$, so that $\psi$ is in the fixed field $L_{\mathrm{Gal}(L/K(\varphi))}$.

• By the Galois correspondence, $K(\varphi) = L_{\mathrm{Gal}(L/K(\varphi))}$.


Thus $\psi \in K(\varphi)$, and hence $\psi$ is a rational function in $\varphi$ with coefficients in $K$.

Our second proof is taken from [3] and is much more in the spirit of Lagrange.
As above, let $\varphi_1 = \varphi, \dots, \varphi_r$ be the different rational functions obtained by letting
$S_n$ act on $\varphi$. Fix $i$ between 1 and $r$. The proof of Theorem 12.1.4 shows that $\varphi_i$
corresponds to some left coset of $H(\varphi)$, say $\sigma_i H(\varphi)$. Since $H(\varphi)$ doesn’t affect $\psi$,
it follows that every element of $\sigma_i H(\varphi)$ takes $\psi$ to $\psi_i = \sigma_i\cdot\psi$. In this way, we get
elements $\psi_1 = \psi, \dots, \psi_r$, which need not be distinct (do you see why?).

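Using the $\varphi_i$ and $\psi_i$, consider the function
$$\Psi(x) = \theta(x)\left(\frac{\psi_1}{x - \varphi_1} + \cdots + \frac{\psi_r}{x - \varphi_r}\right) = \psi_1\,\frac{\theta(x)}{x - \varphi_1} + \cdots + \psi_r\,\frac{\theta(x)}{x - \varphi_r}, \tag{12.5}$$
where $\theta(x)$ is the resolvent of $\varphi$. In spite of the denominators, $\Psi(x)$ is actually
a polynomial in $x$, since $\theta(x)$ is by definition divisible by $x - \varphi_1, \dots, x - \varphi_r$. An
element $\sigma \in S_n$ permutes the $\psi_i/(x - \varphi_i)$, so that $\psi_1/(x - \varphi_1) + \cdots + \psi_r/(x - \varphi_r)$ is
unaffected by $S_n$. Since $\theta(x) \in K[x]$, the coefficients of $\Psi(x)$ must be symmetric.
Hence $\Psi(x) \in K[x]$.

Next observe that if we evaluate the polynomial
$$\frac{\theta(x)}{x - \varphi_i} = \prod_{j \ne i} (x - \varphi_j)$$
at $\varphi_1$, then we get $\prod_{j=2}^{r} (\varphi_1 - \varphi_j)$ when $i = 1$ and 0 when $2 \le i \le r$. Looking at the
formula (12.5) for $\Psi(x)$, we conclude that
$$\Psi(\varphi_1) = \psi_1 \prod_{j=2}^{r} (\varphi_1 - \varphi_j) + 0 + \cdots + 0 = \psi_1 \prod_{j=2}^{r} (\varphi_1 - \varphi_j). \tag{12.6}$$

However, $\theta(x) = (x - \varphi_1)\cdots(x - \varphi_r)$ and (5.7) imply that
$$\theta'(\varphi_1) = (\varphi_1 - \varphi_2)\cdots(\varphi_1 - \varphi_r) = \prod_{j=2}^{r} (\varphi_1 - \varphi_j). \tag{12.7}$$

Since $\varphi_1 = \varphi$ and $\psi_1 = \psi$, equations (12.6) and (12.7) give the equation
$$\psi = \frac{\Psi(\varphi)}{\theta'(\varphi)}. \tag{12.8}$$
This expression lies in $K(\varphi)$ since $\Psi(x)$ and $\theta'(x)$ have coefficients in $K$. ∎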


One advantage of the second proof of Theorem 12.1.6 is that it gives an explicit
formula (12.8) for expressing $\psi$ in terms of $\varphi$, though in practice computing this
formula can be unpleasant. While this proof differs from Lagrange’s, it uses $\varphi_i$ and
$\psi_i$ in the same way, and Lagrange knew the formula (12.5), which is closely related
to the Lagrange interpolation formula stated in Exercise 1 of Section 4.2.
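
Formula (12.8) can be tried out on numbers. The following sketch is an illustration under assumptions: it takes $\varphi = x_1 + x_2 - x_3 - x_4$ and $\psi = x_1x_2 + x_3x_4$, replaces the variables by the sample values $1, 2, 4, 8$ (chosen so that the six conjugates of $\varphi$ are distinct), and uses hypothetical helper names (`build`, `poly_mul`, and so on).

```python
from fractions import Fraction
from itertools import permutations

def phi(x):
    """t1 = x1 + x2 - x3 - x4."""
    return x[0] + x[1] - x[2] - x[3]

def psi(x):
    """y1 = x1*x2 + x3*x4, fixed by every permutation fixing t1."""
    return x[0] * x[1] + x[2] * x[3]

def poly_mul(a, b):
    """Multiply coefficient lists (lowest degree first)."""
    out = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out

def poly_add(a, b):
    n = max(len(a), len(b))
    return [(a[i] if i < len(a) else 0) + (b[i] if i < len(b) else 0)
            for i in range(n)]

def poly_eval(p, x):
    """Horner evaluation of a coefficient list at x."""
    result = 0
    for c in reversed(p):
        result = result * x + c
    return result

def build(roots):
    """Return ({phi_i: psi_i}, theta, Psi) for the given sample roots."""
    pairs = {}
    for s in permutations(range(len(roots))):
        xs = tuple(roots[i] for i in s)
        # psi_i depends only on the coset, so each phi_i gets a single psi_i
        assert pairs.setdefault(phi(xs), psi(xs)) == psi(xs)
    theta, Psi = [1], [0]
    for f in pairs:
        theta = poly_mul(theta, [-f, 1])
        term = [pairs[f]]                     # psi_i * prod_{j != i} (x - phi_j)
        for g in pairs:
            if g != f:
                term = poly_mul(term, [-g, 1])
        Psi = poly_add(Psi, term)
    return pairs, theta, Psi

pairs, theta, Psi = build((1, 2, 4, 8))
_, theta2, Psi2 = build((8, 1, 4, 2))         # same roots in another order

dtheta = [i * c for i, c in enumerate(theta)][1:]   # derivative of theta
recovered = {f: Fraction(poly_eval(Psi, f), poly_eval(dtheta, f))
             for f in pairs}
print(recovered == pairs, theta == theta2, Psi == Psi2)
```

Reordering the sample roots leaves the coefficients of $\theta$ and $\Psi$ unchanged, which is the numerical shadow of $\Psi(x), \theta(x) \in K[x]$, and the single rational expression $\Psi/\theta'$ sends each conjugate $\varphi_i$ to the matching $\psi_i$.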

Here are some simple applications of Theorem 12.1.6. 


Example 12.1.7 Let $L = F(x_1,x_2,x_3,x_4)$, where $F$ has characteristic $\ne 2$. The
isotropy subgroup of $t_1 = x_1 + x_2 - x_3 - x_4$ is $\langle (12), (34)\rangle \subset S_4$. Since $y_1 = x_1x_2 + x_3x_4$
is fixed by these permutations, we conclude that $y_1 \in K(t_1)$.


Example 12.1.8 Since fields of characteristic 0 are infinite, we can pick distinct
elements $t_1,\dots,t_n \in F$. Now consider
$$\beta = t_1x_1 + \cdots + t_nx_n \in L = F(x_1,\dots,x_n).$$

If $\sigma \in S_n$, then $\sigma\cdot\beta = t_1x_{\sigma(1)} + \cdots + t_nx_{\sigma(n)}$. Since $t_1,\dots,t_n$ are distinct, it follows
that $\sigma\cdot\beta = \beta$ if and only if $\sigma$ is the identity. Thus $H(\beta) = \{e\}$. This means that
any $\psi \in L$ is fixed by $H(\beta) = \{e\}$, so that $\psi \in K(\beta)$ by Theorem 12.1.6. Since $\psi$
was an arbitrary element of $L$, we see that $L = K(\beta)$, i.e., $\beta$ is a primitive element of
$L$. Furthermore, $H(\beta) = \{e\}$ and Theorem 12.1.4 imply that the resolvent of $\beta$ has
degree $n!$. Thus
$$\theta(y) = \prod_{\sigma\in S_n} (y - \sigma\cdot\beta) = \prod_{\sigma\in S_n} \bigl(y - (t_1x_{\sigma(1)} + \cdots + t_nx_{\sigma(n)})\bigr)$$
is the resolvent of $\beta$. This will be useful in the next section.
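
The role of distinct coefficients in Example 12.1.8 is easy to see by enumeration. This is a small sketch, assuming integer sample coefficients and a hypothetical helper `conjugates`; a linear form is stored as its tuple of coefficients.

```python
from itertools import permutations

def conjugates(t):
    """Distinct conjugates of beta = t[0]*x1 + ... + t[n-1]*xn under S_n.

    sigma.beta = sum of t[i] * x_{sigma(i)}, so the coefficient of
    x_{sigma(i)} in the conjugate is t[i].
    """
    n = len(t)
    seen = set()
    for sigma in permutations(range(n)):
        coeffs = [None] * n
        for i in range(n):
            coeffs[sigma[i]] = t[i]
        seen.add(tuple(coeffs))
    return seen

distinct = conjugates((1, 2, 3, 4))   # distinct t_i: trivial isotropy group
repeated = conjugates((1, 1, 2, 3))   # t_1 = t_2: beta is fixed by (12)
print(len(distinct), len(repeated))
```

With distinct coefficients all $4! = 24$ conjugates differ, so the resolvent has degree $n!$; with a repeated coefficient the orbit shrinks and $\beta$ is no longer a primitive element by this argument.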


Lagrange was aware of Example 12.1.8, so that the idea of a primitive element 
dates back to the very beginnings of Galois theory. 

From the point of view of Galois theory, Theorem 12.1.6 is exciting in that it 
reveals a strong connection with the Galois correspondence. It gets even better when 
we bring in Lagrange’s similar functions. Here is the precise result.

Theorem 12.1.9 Let $\varphi, \psi \in L = F(x_1,\dots,x_n)$. Then $\varphi$ and $\psi$ are similar functions
if and only if $K(\varphi) = K(\psi)$.


Proof: We have $K(\varphi) = K(\psi) \iff \mathrm{Gal}(L/K(\varphi)) = \mathrm{Gal}(L/K(\psi))$ by the Galois
correspondence. Using (12.1), this gives $K(\varphi) = K(\psi) \iff H(\varphi) = H(\psi)$ (be sure
you understand why). Since $\varphi$ and $\psi$ are similar if and only if $H(\varphi) = H(\psi)$, the
theorem follows. ∎


This theorem shows that the intrinsic object corresponding to similar functions
is the field they generate when adjoined to $K = F(\sigma_1,\dots,\sigma_n)$. For us, it is natural
to think in terms of fields. Lagrange, on the other hand, was writing before set
theory was developed, so that he and his contemporaries tended to think of individual
elements rather than the sets in which they lie. At the same time, Lagrange knew
that individual functions $\varphi \in L$ weren’t intrinsic, which is why he introduced similar
functions. Taken together, similar functions and Theorem 12.1.6 show that Lagrange
had an implicit understanding of the Galois correspondence for the extension $K \subset L$.


C. The Quartic. After analyzing the solutions of cubic and quartic equations, 
Lagrange states his strategy for solving equations as follows [Lagrange, p. 355]: 

As should be clear from this analysis that we have just given of the main known
methods for the solution of equations, all these methods reduce to the same
general principle, namely to find functions of the roots of the proposed equation
such that: 1° the equation or equations by which they are given, i.e., of which they
are the roots (equations that are usually called reduced equations), happen to be
of a degree smaller than that of the proposed equation, or at least decomposable
into other equations of a smaller degree than this one; 2° the values of the desired
roots can be easily deduced from them.


Here, “functions of the roots of the proposed equation” are elements $\varphi \in L$, and
“reduced equations” are the corresponding resolvent polynomials. So Lagrange’s
idea is to look for resolvent polynomials that either have smaller degree or factor into
polynomials of smaller degree.

To see what this means in practice, let us apply Lagrange’s methods to Ferrari’s 
solution of the universal quartic equation 


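$$x^4 - \sigma_1x^3 + \sigma_2x^2 - \sigma_3x + \sigma_4 = 0.$$
We first describe what Ferrari did (with some of the algebra left to the exercises).
Write the above equation as
$$x^4 - \sigma_1x^3 = -\sigma_2x^2 + \sigma_3x - \sigma_4.$$
Since $F$ has characteristic 0, we can add the quantity
$$yx^2 + \frac14(-\sigma_1x + y)^2 = \Bigl(y + \frac{\sigma_1^2}{4}\Bigr)x^2 - \frac{\sigma_1}{2}\,yx + \frac{y^2}{4}$$
to each side, where $y$ is yet to be chosen. In Exercise 4 you will show that this leads
to the equation
$$\Bigl(x^2 - \frac{\sigma_1}{2}x + \frac{y}{2}\Bigr)^2 = \Bigl(y + \frac{\sigma_1^2}{4} - \sigma_2\Bigr)x^2 + \Bigl(-\frac{\sigma_1}{2}y + \sigma_3\Bigr)x + \frac{y^2}{4} - \sigma_4. \tag{12.9}$$

We next choose $y$ so that the right-hand side of (12.9) is also a perfect square. The
right-hand side is quadratic in $x$. In general, if $A \ne 0$, then
$$Ax^2 + Bx + C = A\Bigl(x + \frac{B}{2A}\Bigr)^2 \iff B^2 = 4AC.$$

Applying this to the right-hand side of (12.9), you will show in Exercise 4 that
$B^2 = 4AC$ leads to the cubic equation
$$y^3 - \sigma_2y^2 + (\sigma_1\sigma_3 - 4\sigma_4)\,y - \sigma_3^2 - \sigma_1^2\sigma_4 + 4\sigma_2\sigma_4 = 0. \tag{12.10}$$

This is the resolvent from Example 12.1.3 and is called the Ferrari resolvent.
If $y_1$ is a root of this resolvent, then the above formula for $Ax^2 + Bx + C$ shows
that the right-hand side of (12.9) becomes
$$\Bigl(y_1 + \frac{\sigma_1^2}{4} - \sigma_2\Bigr)\left(x + \frac{-\frac{\sigma_1}{2}y_1 + \sigma_3}{2\bigl(y_1 + \frac{\sigma_1^2}{4} - \sigma_2\bigr)}\right)^2.$$
It follows that (12.9) can be written as
$$\Bigl(x^2 - \frac{\sigma_1}{2}x + \frac{y_1}{2}\Bigr)^2 = \Bigl(y_1 + \frac{\sigma_1^2}{4} - \sigma_2\Bigr)\left(x + \frac{-\frac{\sigma_1}{2}y_1 + \sigma_3}{2\bigl(y_1 + \frac{\sigma_1^2}{4} - \sigma_2\bigr)}\right)^2,$$
so that
$$x^2 - \frac{\sigma_1}{2}x + \frac{y_1}{2} = \pm\sqrt{y_1 + \frac{\sigma_1^2}{4} - \sigma_2}\,\left(x + \frac{-\frac{\sigma_1}{2}y_1 + \sigma_3}{2\bigl(y_1 + \frac{\sigma_1^2}{4} - \sigma_2\bigr)}\right). \tag{12.11}$$

Solving these two quadratic equations in $x$ gives the four roots $x_1,x_2,x_3,x_4$ of our
quartic equation. This is Ferrari’s solution of the quartic.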
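
Ferrari’s procedure can be sketched numerically as follows. This is an illustration under assumptions, not a general-purpose solver: the coefficients are taken to be real, a root of the resolvent (12.10) is found by bisection (a stand-in for Cardan’s formulas), and the function names are hypothetical.

```python
import cmath

def real_root_of_cubic(a, b, c):
    """A real root of y^3 + a*y^2 + b*y + c (real coefficients), by bisection."""
    f = lambda y: ((y + a) * y + b) * y + c
    bound = 1 + max(abs(a), abs(b), abs(c))   # Cauchy bound on the roots
    lo, hi = -bound, bound                    # f(lo) < 0 < f(hi)
    for _ in range(200):
        mid = (lo + hi) / 2
        if f(mid) > 0:
            hi = mid
        else:
            lo = mid
    return (lo + hi) / 2

def solve_quadratic(p, q):
    """Both roots of x^2 + p*x + q."""
    d = cmath.sqrt(p * p - 4 * q)
    return [(-p + d) / 2, (-p - d) / 2]

def ferrari(s1, s2, s3, s4):
    """Roots of x^4 - s1*x^3 + s2*x^2 - s3*x + s4 via (12.9)-(12.11)."""
    # y is a root of the Ferrari resolvent (12.10)
    y = real_root_of_cubic(-s2, s1 * s3 - 4 * s4,
                           -s3**2 - s1**2 * s4 + 4 * s2 * s4)
    A = y + s1**2 / 4 - s2                    # coefficient of x^2 in (12.9)
    B = -s1 * y / 2 + s3                      # coefficient of x in (12.9)
    if abs(A) < 1e-12:
        raise ValueError("A = 0 for this resolvent root; (12.11) does not apply")
    root_A = cmath.sqrt(A)
    roots = []
    for sign in (1, -1):
        # x^2 - (s1/2) x + y/2 = sign * sqrt(A) * (x + B/(2A))
        p = -s1 / 2 - sign * root_A
        q = y / 2 - sign * root_A * B / (2 * A)
        roots += solve_quadratic(p, q)
    return roots

# (x - 1)(x - 2)(x - 3)(x - 5) = x^4 - 11x^3 + 41x^2 - 61x + 30
roots = ferrari(11, 41, 61, 30)
print(sorted(z.real for z in roots))
```

For this sample quartic the resolvent is $y^3 - 41y^2 + 551y - 2431$, with roots $11, 13, 17$; whichever root bisection finds, the two quadratics in (12.11) split the roots $1, 2, 3, 5$ into two pairs.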

One of Lagrange’s main observations is that the auxiliary polynomials and radicals
used in solving cubics or quartics come from expressions built from the roots and
hence can be explained in terms of resolvent polynomials. For example, the Ferrari
resolvent (12.10) is the resolvent polynomial of $y_1 = x_1x_2 + x_3x_4$ from Example 12.1.3.
Exercise 5 will show how $y_1 = x_1x_2 + x_3x_4$ follows from (12.11).

We can also explain the square root in (12.11) in the same way. Using $y_1 =
x_1x_2 + x_3x_4$ and setting
$$t_1 = x_1 + x_2 - x_3 - x_4,$$
one checks that
$$y_1 + \frac{\sigma_1^2}{4} - \sigma_2 = \frac{t_1^2}{4}.$$
This allows us to define
$$\sqrt{y_1 + \frac{\sigma_1^2}{4} - \sigma_2} = \frac{t_1}{2}. \tag{12.12}$$

The isotropy subgroup of $t_1$ is easily seen to be $\langle (12), (34)\rangle$, which means that
its resolvent polynomial $\theta(t)$ has degree 6. By (12.12), $t_1$ is a root of the quadratic
polynomial
$$t^2 - (4y_1 + \sigma_1^2 - 4\sigma_2) = t^2 - 4y_1 - \sigma_1^2 + 4\sigma_2,$$
which has coefficients in $K(y_1)$. To get a polynomial with coefficients in $K$, we use
the other roots $y_2, y_3$ of the Ferrari resolvent (12.10). This gives
$$(t^2 - 4y_1 - \sigma_1^2 + 4\sigma_2)(t^2 - 4y_2 - \sigma_1^2 + 4\sigma_2)(t^2 - 4y_3 - \sigma_1^2 + 4\sigma_2). \tag{12.13}$$

In Exercise 6 you will show that this polynomial lies in $K[t]$ and hence is the resolvent
$\theta(t)$ since it has degree 6.

The passage from Lagrange quoted at the beginning of our discussion of the 
quartic states that we need resolvent polynomials “of a degree smaller than that of the 
proposed equation, or at least decomposable into other equations of a smaller degree 
than this one.” In terms of what we just did for the quartic, this means the following: 


• For $y_1 = x_1x_2 + x_3x_4$, the resolvent has degree 3, which is smaller than 4. So we
can find $y_1,y_2,y_3$ by Cardan’s formulas.

• For $t_1 = x_1 + x_2 - x_3 - x_4$, the resolvent has degree 6, but since we already know
$y_1,y_2,y_3$, we can decompose the resolvent into quadratics as in (12.13). Then we
get $t_1$ in terms of $y_1$ by extracting a square root.


Hence Ferrari’s solution can be seen as a special case of Lagrange’s strategy. 
We can also describe the above derivation in terms of fields and Galois groups. 
Example 12.1.7 shows that we have fields 


$$K \subset K(y_1) \subset K(t_1) \subset L. \tag{12.14}$$

Since $K \subset K(y_1)$ has degree 3, we can use Cardan’s formulas to express the splitting
field, and (12.12) shows that $K(y_1) \subset K(t_1)$ is obtained by adjoining a square root. If
we take the Galois groups of (12.14), then $\mathrm{Gal}(L/K) \simeq S_4$ gives the subgroups
$$H(t_1) = \langle (12), (34)\rangle \subset H(y_1) = \langle (1324), (12)\rangle \subset S_4.$$

This differs from what we did in Chapter 8. There, we wanted a chain of subgroups
where each was normal in the next larger. In contrast, $H(y_1) \subset S_4$ is not normal (see
Exercise 3). Hence Lagrange did not follow a strictly “Galois-theoretic” approach
to solving the quartic. However, the way he built up the solution using extensions of
smaller degree shows that he had the beginnings of a theory of solvability.

There is still more to say about the quartic, since the fields in (12.14) only give
$t_1 = x_1 + x_2 - x_3 - x_4$. We need to explain how to go from here to the roots $x_1,x_2,x_3,x_4$.
Rather than pursuing Ferrari’s solution (12.11), we will switch to Euler’s solution,
which follows naturally from what we have done so far.

The key idea is to work simultaneously with the extensions (12.14) and their
conjugates. For $K(y_1)$, this means using $K(y_2)$ and $K(y_3)$. As for $K(t_1)$, the six
conjugates of $t_1$ are $\pm t_1, \pm t_2, \pm t_3$, where $t_2 = (23)\cdot t_1 = x_1 - x_2 + x_3 - x_4$ and $t_3 =
(24)\cdot t_1 = x_1 - x_2 - x_3 + x_4$. By (12.12), we know that
$$t_1 = \sqrt{\sigma_1^2 - 4\sigma_2 + 4y_1},$$
and similarly one can show that
$$t_2 = \sqrt{\sigma_1^2 - 4\sigma_2 + 4y_2} \quad\text{and}\quad t_3 = \sqrt{\sigma_1^2 - 4\sigma_2 + 4y_3},$$
where $y_2 = (23)\cdot y_1$ and $y_3 = (24)\cdot y_1$. We can thus express $t_1,t_2,t_3$ in terms of
radicals once we find $y_1,y_2,y_3$ using Cardan’s formulas.
In Exercise 7 you will show that the equations
$$\begin{aligned}
\sigma_1 &= x_1 + x_2 + x_3 + x_4,\\
t_1 &= x_1 + x_2 - x_3 - x_4,\\
t_2 &= x_1 - x_2 + x_3 - x_4,\\
t_3 &= x_1 - x_2 - x_3 + x_4
\end{aligned} \tag{12.15}$$
imply that
$$\begin{aligned}
x_1 &= \tfrac14(\sigma_1 + t_1 + t_2 + t_3),\\
x_2 &= \tfrac14(\sigma_1 + t_1 - t_2 - t_3),\\
x_3 &= \tfrac14(\sigma_1 - t_1 + t_2 - t_3),\\
x_4 &= \tfrac14(\sigma_1 - t_1 - t_2 + t_3).
\end{aligned}$$
Thus the $x_i$ can be expressed as sums involving three square roots. However, we
can’t make independent choices of signs, since this would lead to eight values for the
roots. The point is that $t_1,t_2,t_3$ satisfy the identity
$$t_1t_2t_3 = \sigma_1^3 - 4\sigma_1\sigma_2 + 8\sigma_3 \tag{12.16}$$
(see Exercise 8), so that knowing two of the square roots determines the third. Hence,
if $y_1,y_2,y_3$ are the roots of the Ferrari resolvent (12.10), then the four roots of the
quartic are
$$\frac14\Bigl(\sigma_1 \pm \sqrt{4y_1 + \sigma_1^2 - 4\sigma_2} \pm \sqrt{4y_2 + \sigma_1^2 - 4\sigma_2} \pm \sqrt{4y_3 + \sigma_1^2 - 4\sigma_2}\Bigr), \tag{12.17}$$
where the $\pm$ signs are chosen so that the product of the radicals is the right-hand side
of (12.16). This is Euler’s solution of the quartic.
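
Euler’s formula (12.17), together with the sign rule (12.16), can be sketched in code. The version below is an illustration under assumptions: real coefficients, the resolvent roots obtained by bisection and deflation rather than by Cardan’s formulas, and hypothetical function names.

```python
import cmath

def cubic_roots(a, b, c):
    """All roots of y^3 + a*y^2 + b*y + c (real coefficients)."""
    f = lambda y: ((y + a) * y + b) * y + c
    bound = 1 + max(abs(a), abs(b), abs(c))   # Cauchy bound on the roots
    lo, hi = -bound, bound
    for _ in range(200):                      # bisection for one real root r
        mid = (lo + hi) / 2
        if f(mid) > 0:
            hi = mid
        else:
            lo = mid
    r = (lo + hi) / 2
    # deflation: cubic = (y - r) * (y^2 + p*y + q)
    p, q = a + r, b + r * (a + r)
    d = cmath.sqrt(p * p - 4 * q)
    return [r, (-p + d) / 2, (-p - d) / 2]

def euler_quartic(s1, s2, s3, s4):
    """Roots of x^4 - s1*x^3 + s2*x^2 - s3*x + s4 via (12.15)-(12.17)."""
    ys = cubic_roots(-s2, s1 * s3 - 4 * s4,
                     -s3**2 - s1**2 * s4 + 4 * s2 * s4)
    base = s1**2 - 4 * s2
    t1 = cmath.sqrt(base + 4 * ys[0])         # either sign may be taken
    t2 = cmath.sqrt(base + 4 * ys[1])         # either sign may be taken
    product = s1**3 - 4 * s1 * s2 + 8 * s3    # right-hand side of (12.16)
    if abs(t1 * t2) > 1e-12:
        t3 = product / (t1 * t2)              # sign of t3 forced by (12.16)
    else:
        t3 = cmath.sqrt(base + 4 * ys[2])     # product is 0: any sign works
    return [(s1 + t1 + t2 + t3) / 4,
            (s1 + t1 - t2 - t3) / 4,
            (s1 - t1 + t2 - t3) / 4,
            (s1 - t1 - t2 + t3) / 4]

# (x - 1)(x - 2)(x - 3)(x - 5) = x^4 - 11x^3 + 41x^2 - 61x + 30
roots = euler_quartic(11, 41, 61, 30)
print(sorted(z.real for z in roots))
```

Computing $t_3$ from (12.16) instead of taking a third independent square root is exactly what prevents the eight spurious sign combinations mentioned in the text.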

Lagrange discusses other solutions of the quartic and interprets them in terms of 
resolvents. In general, his approach to solving equations anticipates many features 
of Galois theory, though there are important differences, especially in the appearance 
of nonnormal subgroups and the use of conjugate fields. 


D. Higher Degrees. Although Lagrange’s methods work wonderfully for equations
of degree $\le 4$, they fail for degrees 5 and greater. One way to see this is by the
theory of Chapter 8, which tells us that $K \subset L$ is not solvable by radicals for $n \ge 5$,
since $\mathrm{Gal}(L/K) \simeq S_n$ is not a solvable group for $n \ge 5$.

It is also possible to describe this failure in terms of Lagrange’s strategy. Since
the degree of a resolvent polynomial is the index of the isotropy subgroup, finding
resolvents of small degree is equivalent to finding subgroups of $S_n$ of small index.
However, as soon as $n \ge 5$, such subgroups are hard to find, as we will now prove in
the following theorem.

Theorem 12.1.10 Let $n \ge 5$.
(a) If $H \subset S_n$ is a subgroup of index $[S_n : H] > 1$, then either $H = A_n$ or $[S_n : H] \ge n$.
(b) If $H \subset A_n$ is a subgroup of index $[A_n : H] > 1$, then $[A_n : H] \ge n$.


Proof: To prove part (a), we first note that there is $\varphi \in L$ whose isotropy group is
precisely the subgroup $H$, i.e., $H = H(\varphi)$. You will prove this in Exercise 9. Then
write the distinct rational functions of the form $\sigma\cdot\varphi$ for $\sigma \in S_n$ as
$$\varphi_1 = \varphi, \varphi_2, \dots, \varphi_r.$$
By Theorem 12.1.4 we know that $r = [S_n : H]$. Now consider the set
$$N = \{\sigma \in S_n \mid \sigma\cdot\varphi_i = \varphi_i \text{ for all } i = 1,\dots,r\}.$$

In Exercise 10 you will show that $N$ is a subgroup of $S_n$. Note also that every $\sigma \in N$
fixes $\varphi_1 = \varphi$, which implies that $N \subset H$.

The key point of the proof is that $N$ is a normal subgroup of $S_n$. To prove this,
we must show that $\tau^{-1}\sigma\tau \in N$ for all $\sigma \in N$ and $\tau \in S_n$. Fix $i$ between 1 and $r$. If
$\tau\cdot\varphi_i = \varphi_j$ for some $j$, then $\tau^{-1}\cdot\varphi_j = \varphi_i$. Using $\sigma \in N$, this implies that
$$(\tau^{-1}\sigma\tau)\cdot\varphi_i = (\tau^{-1}\sigma)\cdot\varphi_j = \tau^{-1}\cdot(\sigma\cdot\varphi_j) = \tau^{-1}\cdot\varphi_j = \varphi_i.$$
This is true for all $i$, so that $\tau^{-1}\sigma\tau \in N$. Thus $N$ is normal in $S_n$.

Since $N \subset H \ne S_n$, Proposition 8.4.6 implies that either $N = \{e\}$ or $N = H = A_n$.
To complete the proof, we must show that $N = \{e\}$ implies $[S_n : H] \ge n$.

We will show that $N = \{e\}$ and $r = [S_n : H] < n$ lead to a contradiction. First
observe that every $\tau \in S_n$ permutes the $\varphi_i$. The number of $\varphi_i$’s is $r$, so that they can
be permuted in $r!$ ways. Yet the number of $\tau$’s is $n!$. Since $r < n$ implies $r! < n!$,
there must be $\tau_1 \ne \tau_2$ in $S_n$ that give the same permutation of the $\varphi_i$. Thus
$$\tau_1\cdot\varphi_i = \tau_2\cdot\varphi_i \quad\text{for all } i = 1,\dots,r,$$
which easily implies that
$$\tau_2^{-1}\tau_1\cdot\varphi_i = \varphi_i \quad\text{for all } i = 1,\dots,r.$$

Thus $\tau_2^{-1}\tau_1 \in N$, so that $N \ne \{e\}$, since $\tau_1 \ne \tau_2$. This contradicts $N = \{e\}$.
You will prove part (b) in Exercise 11. ∎
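
For $n = 5$, part (a) of the theorem can be probed by a brute-force search. The sketch below is an illustration, not a proof: as a simplifying assumption it only examines subgroups of $S_5$ generated by at most two elements, and the helper names are hypothetical.

```python
from itertools import permutations, combinations_with_replacement

def compose(p, q):
    """Permutation 'apply q first, then p' on 0-based tuples."""
    return tuple(p[i] for i in q)

def generate(gens, n):
    """Subgroup of S_n generated by gens, by closing under multiplication."""
    identity = tuple(range(n))
    group, frontier = {identity}, [identity]
    while frontier:
        g = frontier.pop()
        for s in gens:
            h = compose(g, s)
            if h not in group:
                group.add(h)
                frontier.append(h)
    return group

def is_even(p):
    """Parity of a permutation via its number of inversions."""
    inversions = sum(1 for i in range(len(p))
                     for j in range(i + 1, len(p)) if p[i] > p[j])
    return inversions % 2 == 0

n = 5
Sn = list(permutations(range(n)))
An = frozenset(p for p in Sn if is_even(p))

# All subgroups generated by one or two elements (an assumed restriction).
subgroups = {frozenset(generate((a, b), n))
             for a, b in combinations_with_replacement(Sn, 2)}

indices = sorted({len(Sn) // len(H) for H in subgroups})
index_two = [H for H in subgroups if len(Sn) // len(H) == 2]
print(indices)
```

Among the subgroups found, none has index 3 or 4, and the only one of index 2 is $A_5$, in agreement with Theorem 12.1.10(a).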


To see how this messes up Lagrange’s strategy, suppose that $n \ge 5$ and that
the resolvent $\theta(x)$ of $\varphi \in L$ has degree $> 1$. Since $\deg(\theta) = [S_n : H(\varphi)]$, part (a)
of Theorem 12.1.10 tells us that either $\deg(\theta) \ge n$ or $H(\varphi) = A_n$, in which case
$\varphi \in K(\sqrt{\Delta})$ by Theorem 7.4.4. Hence the only reasonable way to begin Lagrange’s
strategy is to pick $\varphi = \sqrt{\Delta}$. But then continuing his strategy would entail finding
a proper subgroup of $A_n$ of index $< n$ (you will verify this in Exercise 12). Such
subgroups don’t exist (this is part (b) of Theorem 12.1.10), so that Lagrange’s strategy
fails for $n \ge 5$.


E. Lagrange Resolvents. To see where Lagrange resolvents come from, recall
that the solution of the cubic used
$$z_1 = \tfrac13\,(x_1 + \omega^2x_2 + \omega x_3), \quad \omega = e^{2\pi i/3},$$
and the above solution of the quartic used
$$t_1 = x_1 + x_2 - x_3 - x_4 = x_1 + (-1)x_3 + (-1)^2x_2 + (-1)^3x_4.$$
Aside from the factor of $\tfrac13$, both expressions involve the roots multiplied by roots of
unity to increasing powers. Here’s how Lagrange says this [Lagrange, p. 356]:


As to equations that do not exceed the fourth degree, the simplest functions that
yield their solution can be represented by the general formula
$$x' + yx'' + y^2x''' + \cdots + y^{\mu-1}x^{(\mu)},$$
$x', x'', x''', \dots, x^{(\mu)}$ being the roots of the proposed equation, which is assumed
to have degree $\mu$, and $y$ being a root different from 1 of the equation
$$y^\mu - 1 = 0.$$


In Lemma 8.3.2 of Chapter 8, we used the name “Lagrange resolvent” for such
expressions. There, we wanted to show that Galois extensions of prime degree $p$ are
obtained by adjoining a $p$th root when the smaller field contains a primitive $p$th root
of unity. Our main tool was the Lagrange resolvent (8.7):
$$\alpha_i = \beta + \zeta^{-i}\sigma(\beta) + \zeta^{-2i}\sigma^2(\beta) + \cdots + \zeta^{-i(p-1)}\sigma^{p-1}(\beta).$$

You should reread the proof of Lemma 8.3.2, especially (8.7) and (8.8).

We can apply the formula for $\alpha_1$ to the extension $K \subset L$ as follows. Let $\sigma =
(12\dots n) \in S_n \simeq \mathrm{Gal}(L/K)$, let $\beta = x_1$, and let $\zeta$ be an $n$th root of unity. Then
replacing $p$ with $n$ in the above formula gives the Lagrange resolvent
$$\begin{aligned}
\alpha_1 &= x_1 + \zeta^{-1}\sigma\cdot x_1 + \zeta^{-2}\sigma^2\cdot x_1 + \cdots + \zeta^{-(n-1)}\sigma^{n-1}\cdot x_1\\
&= x_1 + \zeta^{-1}x_2 + \zeta^{-2}x_3 + \cdots + \zeta^{-(n-1)}x_n.
\end{aligned} \tag{12.18}$$

This agrees with Lagrange’s “general formula.” One can prove the identity
$$\sigma\cdot\alpha_1 = \zeta\,\alpha_1,$$
which easily implies that $\sigma = (12\dots n)$ fixes
$$\theta_1 = \alpha_1^n = (x_1 + \zeta^{-1}x_2 + \zeta^{-2}x_3 + \cdots + \zeta^{-(n-1)}x_n)^n.$$

The proofs are identical to the arguments used in the proof of Lemma 8.3.2. One
can also show that if $\zeta$ is a primitive $n$th root of unity, then $(12\dots n)$ generates the
isotropy subgroup of $\theta_1 = \alpha_1^n$, i.e., $H(\theta_1) = \langle (12\dots n)\rangle$ (see Exercise 13). It follows
that the resolvent polynomial of $\theta_1$ has degree $n!/n = (n-1)!$. Lagrange states this
as follows [Lagrange, pp. 332-333]:

… one gets an equation in $\theta$ of degree $1.2.3\ldots(\mu - 1)$, whose roots are the
values of $\theta$ that come from the permutations of the $\mu - 1$ roots $x'', x''', \dots$ ignoring
the root $x'$.


(As above, Lagrange uses $\mu$ instead of $n$ for the degree.) In Exercise 14 you will
work out how the final part of this statement relates to the proof of Theorem 12.1.4.
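
The claim $H(\theta_1) = \langle (12\dots n)\rangle$ can be checked exactly for small $n$. The sketch below rests on an assumption that should be stated plainly: for a primitive $n$th root of unity $\zeta$, a permutation $\tau$ fixes $\alpha_1^n$ precisely when $\tau\cdot\alpha_1 = \zeta^c\alpha_1$ for some $c$, so it suffices to track exponents of $\zeta$ modulo $n$. The function names are hypothetical.

```python
from itertools import permutations
from math import factorial

def isotropy_of_theta(n):
    """Permutations tau with tau.alpha_1 = zeta^c * alpha_1 for some c.

    alpha_1 = sum over k of zeta^{e[k]} * x_{k+1} with e[k] = -k mod n,
    and tau sends x_{k+1} to x_{tau(k)+1} (0-based bookkeeping).
    """
    e = [(-k) % n for k in range(n)]
    stabilizer = []
    for tau in permutations(range(n)):
        moved = [None] * n
        for k in range(n):
            moved[tau[k]] = e[k]          # exponent on x_{tau(k)+1} in tau.alpha_1
        shifts = {(moved[m] - e[m]) % n for m in range(n)}
        if len(shifts) == 1:              # all exponents shifted by the same c
            stabilizer.append(tau)
    return stabilizer

def cyclic_group(n):
    """The subgroup generated by the n-cycle (1 2 ... n)."""
    sigma = tuple((k + 1) % n for k in range(n))
    g, out = tuple(range(n)), set()
    for _ in range(n):
        out.add(g)
        g = tuple(sigma[i] for i in g)
    return out

results = {n: set(isotropy_of_theta(n)) == cyclic_group(n) for n in (3, 4, 5)}
degrees = {n: factorial(n) // len(isotropy_of_theta(n)) for n in (3, 4, 5)}
print(results, degrees)
```

In each case the isotropy subgroup found is the cyclic group of order $n$, so the resolvent of $\theta_1$ has degree $(n-1)!$, as Lagrange states.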
We conclude with an unexpected property of Lagrange resolvents. Let $n = p$ be
prime and $\zeta_p = e^{2\pi i/p}$. Then (12.18) with $\zeta = \zeta_p^{-i}$ gives the Lagrange resolvent
$$\alpha_i = x_1 + \zeta_p^{i}x_2 + \zeta_p^{2i}x_3 + \cdots + \zeta_p^{(p-1)i}x_p, \tag{12.19}$$
where $i = 0,1,\dots,p-1$. Ignoring $\alpha_0 = \sigma_1$, we set
$$\theta_i = \alpha_i^p = (x_1 + \zeta_p^{i}x_2 + \zeta_p^{2i}x_3 + \cdots + \zeta_p^{(p-1)i}x_p)^p, \quad 1 \le i \le p-1.$$
Lagrange forms the polynomial
$$(x - \theta_1)\cdots(x - \theta_{p-1}) = x^{p-1} - Tx^{p-2} + Ux^{p-3} - Xx^{p-4} + \cdots, \tag{12.20}$$
so that $T, U, X, \dots$ are the elementary symmetric polynomials of $\theta_1,\dots,\theta_{p-1}$. Lagrange
then asserts that “the coefficients $T, U, X, \dots$ are each given by an equation
of degree $1.2.3\ldots(\mu-2)$,” i.e., degree $(p-2)!$ [Lagrange, p. 333].

By claiming that these resolvents have degree $(p-2)!$, Lagrange is in effect
saying that their isotropy subgroups have order $p!/(p-2)! = p(p-1)$. In fact, one
can prove that the isotropy group of $T$ is the subgroup $M_p \subset S_p$ consisting of the
permutations
$$i \mapsto ai + b, \quad a, b \in \mathbb{F}_p,\ a \ne 0,$$
where everything is interpreted modulo $p$. In the Historical Notes to Section 6.4, we
showed that $M_p$ is isomorphic to the affine linear group $\mathrm{AGL}(1,\mathbb{F}_p)$. Furthermore,
we will prove in Proposition 14.1.4 of Section 14.1 that $M_p$ is a maximal solvable
subgroup of $S_p$. So Lagrange essentially found a maximal solvable subgroup of $S_p$.
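
One piece of this picture is easy to test for $p = 5$: which permutations send the set $\{\theta_1,\dots,\theta_{p-1}\}$ to itself. The sketch below is a partial illustration under assumptions; it tracks exponents of $\zeta_p$ modulo $p$, treats $\tau\cdot\theta_i = \theta_j$ as equivalent to $\tau\cdot\alpha_i = \zeta_p^c\alpha_j$, and does not by itself show that these permutations form the full isotropy group of $T$. The names are hypothetical.

```python
from itertools import permutations

def permutes_thetas(tau, p):
    """True if tau sends each theta_i (1 <= i < p) to some theta_j.

    alpha_i = sum over k of zeta_p^{i*k} * x_{k+1}, with 0-based k, and
    tau.theta_i = theta_j is tested as tau.alpha_i = zeta_p^c * alpha_j.
    """
    for i in range(1, p):
        moved = [None] * p
        for k in range(p):
            moved[tau[k]] = (i * k) % p   # exponent on x_{tau(k)+1}
        if not any(len({(moved[m] - j * m) % p for m in range(p)}) == 1
                   for j in range(1, p)):
            return False
    return True

p = 5
found = {tau for tau in permutations(range(p)) if permutes_thetas(tau, p)}

# The affine maps k -> a*k + b modulo p with a != 0.
affine = {tuple((a * k + b) % p for k in range(p))
          for a in range(1, p) for b in range(p)}

print(len(found), found == affine)
```

The permutations found are exactly the $p(p-1) = 20$ affine maps, which matches the order that Lagrange’s degree count $(p-2)!$ predicts for the isotropy subgroup.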

Lagrange concludes that his strategy fails in degree 5 and that 

if the algebraic resolution of equations of degrees greater than four is not
impossible, then it must depend on some functions of the roots, different from the
preceding.


(See [Lagrange, p. 357].) In spite of this failure, it is impressive to see how far 
Lagrange got. His 1770 treatise is one of the great works of algebra. 


Historical Notes 


In the sixteenth century del Ferro, Tartaglia, and Cardan solved the cubic, and 
Ferrari (a student of Cardan) solved the quartic. This was followed by mathematicians 
such as Viète, Hudde, Descartes, Tschirnhaus, Euler, and Bézout, who simplified and
improved these solutions and found some entirely new solutions. Many of these
methods are analyzed by Lagrange in his Réflexions.

The late eighteenth century was a time of active research on the roots of polynomials.
Besides the work of Lagrange just discussed, we have Euler’s solution
of the quartic, which appeared in his 1770 algebra text [10, pp. 282-288] (a nice 
discussion of Euler’s method can be found in [4, pp. 104-107]). Euler also found 
many examples of quintics that are solvable by radicals and gave an incomplete proof 
of the Fundamental Theorem of Algebra. In 1772 Lagrange used the methods of 
his Réflexions to fill most of the gaps in Euler’s proof—though, as mentioned in the 
Historical Notes to Section 3.2, his argument is still incomplete. 

Another important development in the early 1770s was Vandermonde’s Mémoire 
sur la résolution des équations. This paper covers much of the same material as 
Lagrange’s Réflexions, though Vandermonde’s approach is different from Lagrange’s. 
In particular, he considered permutations in more detail than Lagrange and understood
how resolvents relate to the action of permutations. He also used these methods to
treat cubic and quartic equations and independently discovered Lagrange resolvents,
though he didn’t pursue the general theory to the same depth as Lagrange. One way
in which he went significantly beyond Lagrange was his treatment of the equation
$x^n - 1 = 0$. For example, Lagrange notes that $x^{11} - 1 = 0$ reduces to solving a quintic,
but Vandermonde actually solved the resulting quintic by radicals. This may have
been part of what inspired Gauss to investigate $x^p - 1 = 0$, $p$ prime. His results are
discussed in Section 9.2.

Lagrange hoped to solve equations by finding functions of the roots that gave a 
resolvent of small degree. In this section, we learned that 


    the degree of the resolvent of $\varphi \in L$
      = the number of distinct values $\sigma\cdot\varphi$, $\sigma \in S_n$
      = the index of the isotropy subgroup of $\varphi$.


In 1845, Cauchy studied “the problem of the number of values that can be assumed by
functions,” which by the last equality means the study of the index of subgroups of $S_n$.
This was one of the important problems in the early history of group theory. The key 
result is Theorem 12.1.10, which we used to show the failure of Lagrange’s strategy. 
In modern terminology, here are some highlights of how we got from Lagrange to 
Theorem 12.1.10: 


• In 1799 Ruffini published a proof that the quintic is not solvable by radicals. His
proof was hard to follow, but he did show that $S_5$ has no subgroups of index 3 or 4,
which is part (a) of Theorem 12.1.10 for $n = 5$. He also proved the irreducibility
of resolvent polynomials.

• In 1815 Cauchy generalized Ruffini’s result by showing that the index of a subgroup
$H \ne S_n$ is either 2 or at least the largest prime $\le n$. Cauchy used the word
“index” to denote the number of values of a function, which is where the modern
term “index” comes from. The same paper also proved that $A_n$ is generated by
3-cycles. Cauchy also emphasized the importance of the identity permutation and
introduced the two-row notation for permutations.

• In 1824 Abel gave the first generally accepted proof that the general quintic is
not solvable by radicals. He used Cauchy’s (and Ruffini’s) results on the index of
subgroups of $S_5$.

• In 1832 Galois defined a normal subgroup of a group and asserted that a noncyclic
simple group has order at least 60. Note that $|A_5| = 60$.

• In 1845 first Bertrand and then Cauchy proved part (a) of Theorem 12.1.10, though
their proofs are quite different from the one given here. Cauchy also introduced
the cycle notation now taught in introductory courses in abstract algebra.

• In 1869 Jordan defined the concept of simple group, and in 1870 he showed that
$A_n$ is simple for $n \ge 5$.

• In 1879 Kronecker proved Theorem 12.1.10 using the simplicity of $A_n$ for $n \ge 5$.
This is the proof used in the text.


Further results on the index of subgroups in $S_n$ can be found in [15, pp. 138-139] or
[16, p. 274, Note 120].

The Historical Notes to Section 8.5 mentioned the work of Ruffini and Abel on
the unsolvability of the quintic. Now that we know Lagrange’s Réflexions, we can
get a better idea of what they did. Very roughly, both Ruffini and Abel tried to prove
the unsolvability of the quintic by showing the nonexistence of the required resolvent
polynomials, which in terms of group theory reduces to Theorem 12.1.10 for $n = 5$.
But this alone is not enough, for Lagrange’s theory only deals with rational functions
of the roots. But suppose that there were formulas for the roots $x_1,\dots,x_5$ that involved
expressions like $\sqrt{x_1} + \cdots + \sqrt{x_5}$? Lagrange’s methods would no longer apply. So
the first thing Abel had to do was prove that if the quintic were solvable by radicals,
then one could write the solution entirely within $L = F(x_1,\dots,x_n)$, assuming that $F$
had suitable roots of unity. This is discussed in [1]. See also [9] and [10] in the
references to Chapter 8.

Then comes Galois, whose work in the early 1830s is the main topic of the next 
section. The important thing to say here is that Galois’s analysis of solvability by 
radicals led to the concept of solvable group and gave a dramatically simpler approach 
to all of these questions. Namely, once one proves that $S_n$ is not solvable for $n \ge 5$,
then one immediately concludes that the general polynomial of degree $n \ge 5$ isn’t
solvable by radicals and that Lagrange’s strategy for the quintic must fail. Results 
like Theorem 12.1.10 are simply not needed. This shows the power of good ideas. 
And the fact that Cauchy was still pursuing Lagrange’s approach in 1845 shows how 
long it took to understand these ideas. 

Further comments on eighteenth-century algebra can be found in Chapter 6 of [2]. 
The book [Tignol] has chapters on the work of Lagrange, Vandermonde, Ruffini, and 
Abel. A good description of Lagrange’s Réflexions can be found in [11]. This paper 
also discusses the subsequent history of Galois theory. 


Exercises for Section 12.1 


Exercise 1. Let $\theta(x)$ be the resolvent polynomial defined in (12.3). Use the second bullet
following (12.1) to show that $\theta(x) \in K[x]$.
Exercise 2. Work out the details of Example 12.1.2. 


Exercise 3. This exercise concerns Examples 12.1.3 and 12.1.5.

(a) Compute the resolvent $\theta(y)$ of Example 12.1.3. This can be done using the methods of
Section 2.3.

(b) Let $y_1 = x_1x_2 + x_3x_4$. Show that $H(y_1) = \langle (12), (1324)\rangle \subset S_4$.

(c) Show that $H(y_1)$ is not normal in $S_4$.

(d) Show that $H(y_1)$ is isomorphic to $D_8$, the dihedral group of order 8.


Exercise 4. Verify (12.9) and (12.10). 


Exercise 5. This exercise will study the quadratic equations (12.11). Each quadratic has two
roots, which together make up the four roots $x_1,x_2,x_3,x_4$ of our quartic.

(a) For the moment, forget all of the theory developed so far, and let $y$ be some root of the
Ferrari resolvent (12.10). Given only this, can we determine how $y$ relates to the $x_i$? This
is surprisingly easy to do. Suppose $x_i,x_j$ are the roots of (12.11) for one choice of sign,
and $x_k,x_l$ are the roots for the other. Thus $i,j,k,l$ are the numbers $1,2,3,4$ in some order.
Prove that $y$ is given by $y = x_ix_j + x_kx_l$.

(b) Now let $y_1 = x_1x_2 + x_3x_4$, and define the square root in (12.11) using (12.12). Show that
the roots of (12.11) are $x_1,x_2$ for the plus sign and $x_3,x_4$ for the minus sign.

Historically, the Ferrari resolvent was just a tool for solving the quartic. Lagrange was the first
to observe that the roots of (12.10) can be expressed in terms of the roots of the quartic. His
argument [Lagrange, p. 262] is similar to what we did here.


Exercise 6. Explain why the polynomial (12.13) has coefficients in $K = F(\sigma_1,\sigma_2,\sigma_3,\sigma_4)$.

Exercise 7. Show that (12.15) implies the equations for $x_1,x_2,x_3,x_4$ given in the text.


Exercise 8. Let $t_1,t_2,t_3$ be defined as in (12.15).

(a) Lagrange noted that any transposition fixes exactly one of $t_1,t_2,t_3$ and interchanges the
other two, possibly changing the sign of both. Prove this and use it to show that $t_1t_2t_3$ is
fixed by all elements of $S_4$.

(b) Use the methods of Chapter 2 to express $t_1t_2t_3$ in terms of the $\sigma_i$. The result should be the
identity (12.16).


Exercise 9. Let $H$ be a subgroup of $S_n$. In this exercise you will give two proofs that there is
$\varphi \in L$ such that $H = H(\varphi)$.

(a) (First Proof.) The fixed field $L_H$ gives an extension $K \subset L_H$. Explain why the Theorem
of the Primitive Element applies to give $\varphi \in L_H$ such that $L_H = K(\varphi)$. Show that this
$\varphi$ has the desired property.

(b) (Second Proof.) Let $m = x_1^{a_1}\cdots x_n^{a_n}$ be a monomial in $x_1,\dots,x_n$ with distinct exponents
$a_1,\dots,a_n$. Then define
$$\varphi = \sum_{\sigma\in H} \sigma\cdot m = \sum_{\sigma\in H} x_{\sigma(1)}^{a_1}\cdots x_{\sigma(n)}^{a_n}.$$
Prove that $H(\varphi) = H$.


Exercise 10. Prove that the subset $N \subset S_n$ defined in the proof of Theorem 12.1.10 is a
subgroup of $S_n$.


Exercise 11. Let $H$ be a proper subgroup of $A_n$ with $n \ge 5$. Prove that $[A_n : H] \ge n$.


Exercise 12. The discussion following Theorem 12.1.10 shows that if we are going to use
Lagrange’s strategy when $n \ge 5$, then we need to begin with $\varphi = \sqrt{\Delta}$, which has isotropy
subgroup $A_n$. Suppose that $\psi \in L$ is our next choice, and let $\theta(x)$ be the resolvent of $\psi$. Since
we regard $K(\sqrt{\Delta})$ as known, we may assume that $\psi \notin K(\sqrt{\Delta})$. The idea is to factor $\theta(x)$ over
$K(\sqrt{\Delta})$, say $\theta = R_1 \cdots R_s$, where $R_i \in K(\sqrt{\Delta})[x]$ is irreducible. This is similar to how (12.13)
factors the resolvent of $t_1$ over $K(y_1)$. Suppose that $\psi$ enables us to continue Lagrange’s
inductive strategy. This means that some factor of $\theta$, say $R_i$, has degree $< n$. Your goal is to
prove that this implies the existence of a proper subgroup of $A_n$ of index $< n$.

(a) Prove that $\deg(R_i) \ge 2$.

(b) Since $\theta$ splits completely over $L$, the same is true for $R_i$. Let $\psi_i \in L$ be a root of $R_i$ and
consider the fields

$$K \subset K(\sqrt{\Delta}) \subset M = K(\sqrt{\Delta}, \psi_i) \subset L.$$

Let $H_i \subset S_n$ be the subgroup corresponding to $\mathrm{Gal}(L/M) \subset \mathrm{Gal}(L/K)$ under (12.1). Prove
that $H_i \subset A_n$ and that $[A_n : H_i]$ is the degree of $R_i$.

(c) Conclude that $\deg(R_i) < n$ implies that $H_i$ is a proper subgroup of $A_n$ of index $< n$.

With more work, one can show that $\deg(R_i) = [A_n : A_n \cap H(\psi)]$ for all $i$ and that

$$s = \frac{2}{[H(\psi) : A_n \cap H(\psi)]}.$$

It follows that $s = 1$ or $2$.


Exercise 13. Let $\zeta$ be a primitive $n$th root of unity, and let $\alpha = x_1 + \zeta x_2 + \cdots + \zeta^{n-1} x_n$.
Prove that $H(\alpha^n) = \langle (1\,2 \ldots n) \rangle \subset S_n$.


Exercise 14. Let $\alpha_i$ be as in (12.18). The quotation given in the discussion following
(12.18) can be paraphrased as saying that the roots of the resolvent of $\theta_i = \alpha_i^n$ come from the
permutations of the $n - 1$ roots $x_2, \ldots, x_n$ that ignore the root $x_1$. What does this mean?

(a) Show that each left coset of $\langle (1\,2 \ldots n) \rangle$ in $S_n$ can be written uniquely as $\sigma \langle (1\,2 \ldots n) \rangle$,
where $\sigma$ fixes 1.

(b) Explain how Lagrange’s statement follows from part (a).

In general, we say that $g_1, \ldots, g_m \in G$ are coset representatives of a subgroup $H \subset G$ if
$g_1H, \ldots, g_mH$ are the distinct left cosets of $H$ in $G$ (so $m = [G : H]$). Thus Lagrange’s quotation
gives an explicit set of coset representatives for $\langle (1\,2 \ldots n) \rangle \subset S_n$.


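Exercise 15. Given the Lagrange resolvents $\alpha_1, \ldots, \alpha_{p-1}$ defined in (12.19), the goal of this
exercise is to prove that

$$x_i = \frac{1}{p}\Bigl(\sigma_1 + \sum_{j=1}^{p-1} \zeta^{-j(i-1)} \alpha_j\Bigr).$$

(a) Write $\alpha_j = \sum_{l=1}^{p} \zeta^{j(l-1)} x_l$ for $1 \le j \le p$, so that $\alpha_p = \sigma_1$. Then show that

$$\sum_{j=1}^{p} \zeta^{-j(i-1)} \alpha_j = \sum_{l=1}^{p} \Bigl(\sum_{j=1}^{p} \zeta^{j(l-i)}\Bigr) x_l.$$

(b) Given an integer $m$, use Exercise 9 of Section A.2 to prove that

$$\sum_{j=1}^{p} (\zeta^m)^j = \begin{cases} p, & \text{if } m \equiv 0 \bmod p, \\ 0, & \text{otherwise.} \end{cases}$$

(c) Use parts (a) and (b) to prove the desired formula for $x_i$.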
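
The inversion formula of Exercise 15 can be checked numerically before it is proved. The following is a minimal sketch, assuming the resolvents have the form $\alpha_j = x_1 + \zeta^j x_2 + \cdots + \zeta^{j(p-1)} x_p$ with $\zeta = e^{2\pi i/p}$; the function names and the sample values are hypothetical, chosen only for illustration.

```python
import cmath

def lagrange_resolvents(xs):
    """Return [alpha_1, ..., alpha_p], where alpha_j = sum_l zeta^(j(l-1)) x_l."""
    p = len(xs)
    zeta = cmath.exp(2j * cmath.pi / p)  # a primitive p-th root of unity
    # xs[l] holds x_{l+1}, so the exponent j*((l+1)-1) is simply j*l.
    return [sum(zeta ** (j * l) * xs[l] for l in range(p)) for j in range(1, p + 1)]

def recover_roots(alphas):
    """Invert the resolvents: x_i = (1/p) * sum_{j=1}^{p} zeta^(-j(i-1)) alpha_j."""
    p = len(alphas)
    zeta = cmath.exp(2j * cmath.pi / p)
    # alphas[j-1] holds alpha_j; the j = p term is alpha_p = sigma_1.
    return [sum(zeta ** (-j * i) * alphas[j - 1] for j in range(1, p + 1)) / p
            for i in range(p)]

xs = [2.0, -1.0, 0.5, 3.0, 7.0]      # sample values of x_1, ..., x_5 (p = 5)
alphas = lagrange_resolvents(xs)
recovered = recover_roots(alphas)
```

Up to floating-point error, `recovered` agrees with `xs`, and the last resolvent `alphas[-1]` equals the sum $\sigma_1$ of the sample values. A numerical check of this kind is of course not a proof.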


Exercise 16. Prove that Theorem 7.4.4 follows from Theorem 12.1.6 and Proposition 2.4.1. 


334 LAGRANGE, GALOIS, AND KRONECKER 


Exercise 17. In Theorem 12.1.9, we used the Galois correspondence to show that rational
functions $\varphi$ and $\psi$ are similar if and only if $K(\varphi) = K(\psi)$. Give another proof of this result
that uses only Theorem 12.1.6.


Exercise 18. Consider the quartic polynomial $f = x^4 + 2x^2 - 4x + 2 \in \mathbb{Q}[x]$.
(a) Show that the Ferrari resolvent (12.10) is $y^3 - 2y^2 - 8y$.
(b) Using the root $y_1 = 0$ of the cubic of part (a), show that (12.11) becomes

$$x^2 = \pm\sqrt{-2}\,(x - 1)$$

and conclude that the four roots of $f$ are

$$\frac{\sqrt{2}}{2}\,i \pm \frac{1}{2}\sqrt{-2 - 4i\sqrt{2}} \quad\text{and}\quad -\frac{\sqrt{2}}{2}\,i \pm \frac{1}{2}\sqrt{-2 + 4i\sqrt{2}}.$$

(c) Use Euler’s solution (12.17) to find the roots of $f$. The formulas are surprisingly different.

We will see in Chapter 13 that this quartic is especially simple. For most quartics, the formulas
for the roots are much more complicated.


Exercise 19. This exercise will prove a version of Theorem 12.1.10 for a subgroup $H$ of an
arbitrary finite group $G$. When $G = S_n$, Theorem 12.1.10 used the action of $S_n$ on $L$ and wrote
$H = H(\varphi)$ for some $\varphi \in L$. In general, we use the action of $G$ on the left cosets of $H$ defined
by $g \cdot hH = ghH$ for $g, h \in G$.

(a) Prove that $g \cdot hH = ghH$ is well defined, i.e., $hH = h'H$ implies that $ghH = gh'H$.

(b) Prove that $H$ is the isotropy subgroup of the identity coset $eH$.

(c) Let $m = [G : H]$, so that the left cosets of $H$ can be labeled $g_1H, \ldots, g_mH$. Then, for
$g \in G$, let $\sigma \in S_m$ be the permutation such that $g \cdot g_iH = g_{\sigma(i)}H$. Prove that the map
$g \mapsto \sigma$ defines a group homomorphism $G \to S_m$.

(d) Let $N$ be the kernel of the map of part (c). Thus $N$ is a normal subgroup of $G$. Prove that
$N \subset H$.

(e) Prove that $[G : N]$ divides $m!$.

(f) Explain why you have proved the following result: If $H$ is a subgroup of a finite group $G$,
then $H$ contains a normal subgroup of $G$ whose index divides $[G : H]!$.

(g) Use part (f) and Proposition 8.4.6 to give a quick proof of Theorem 12.1.10.
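
The construction in Exercise 19 can be tried out on a small case. The sketch below is only an illustration under stated assumptions: permutations are stored as tuples on $0, \ldots, 3$, the group is $G = S_4$, and $H$ is the subgroup of order 8 generated by $(1\,2)$ and $(1\,3\,2\,4)$; the helper names `compose`, `generate`, and `coset_action` are hypothetical.

```python
from itertools import permutations
from math import factorial

def compose(s, t):
    """Return the permutation i -> s(t(i)); permutations are tuples on 0..n-1."""
    return tuple(s[t[i]] for i in range(len(t)))

def generate(gens, n):
    """Subgroup of S_n generated by gens (closure under left multiplication)."""
    identity = tuple(range(n))
    group, frontier = {identity}, [identity]
    while frontier:
        g = frontier.pop()
        for s in gens:
            h = compose(s, g)
            if h not in group:
                group.add(h)
                frontier.append(h)
    return group

def coset_action(G, H):
    """List the left cosets g_1H, ..., g_mH and the map g -> sigma of part (c)."""
    cosets, seen = [], set()
    for g in sorted(G):               # the identity comes first, so cosets[0] = eH
        if g not in seen:
            coset = frozenset(compose(g, h) for h in H)
            cosets.append(coset)
            seen |= coset
    index = {c: i for i, c in enumerate(cosets)}
    # action[g][i] is the index of the coset g . g_iH
    action = {g: tuple(index[frozenset(compose(g, x) for x in c)] for c in cosets)
              for g in G}
    return cosets, action

G = set(permutations(range(4)))                    # S_4
H = generate([(1, 0, 2, 3), (2, 3, 1, 0)], 4)      # <(1 2), (1 3 2 4)>, 0-indexed
cosets, action = coset_action(G, H)
m = len(cosets)                                    # m = [G : H]
N = {g for g in G if action[g] == tuple(range(m))} # kernel of G -> S_m
```

In this example $m = 3$, the kernel $N$ has order 4 and lies in $H$, and $[G : N] = 6$ divides $3!$, as parts (d) and (e) predict.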


Exercise 20. Let $G$ be a finite group and let $p$ be the smallest prime dividing $|G|$. Prove that
every subgroup of index $p$ in $G$ is normal.


Exercise 21. Part (a) of Theorem 12.1.10 implies that when $n \ge 5$, the index of a proper
subgroup of $S_n$ is either 2 or $\ge n$.
(a) Prove that $S_n$ always has a subgroup $H$ of index $n$. This means that equality can occur in
the bound $[S_n : H] \ge n$.
(b) Give an example to show that Theorem 12.1.10 is false when $n = 4$.


12.2 GALOIS 


In this section we will explore several aspects of Galois’s work. Our discussion will 
be based on his 1831 memoir on Galois theory, entitled Mémoire sur les conditions 
de résolubilité des équations par radicaux. See [Galois, pp. 42-71] for the French
original and [Edwards, pp. 101-113] for an English translation. 
A. Beyond Lagrange. In Section 12.1 we saw that Lagrange studied the universal
case where the roots are variables $x_1, \ldots, x_n$. In contrast, Galois created a theory that
applies to arbitrary polynomials. To see the difference, recall the quotation from 
Galois given in the Historical Notes to Section 7.1 [Galois, p. 51]: 


PROPOSITION I 


THEOREM. For a given equation, let a,b,c,... be the m roots. There is 
always a group of permutations on the letters a,b,c, ... that enjoys the following 
property: 


1° that every function of the roots that is invariant** under the substitutions 
of the group, is rationally known; 


2° conversely, that every function of the roots that is rationally determined, 
is invariant under these substitutions*. 


As noted in Section 7.1, this asserts that in a Galois extension $F \subset L$, the field $F$ is the
fixed field of the Galois group Gal(L/F). For our purposes here, the most interesting 
part of the proposition is the double asterisk **, which refers to the following marginal 
note in Galois’s manuscript [Galois, p. 50]: 

Here we call invariant not only a function whose form is invariant under the 

substitutions of the roots among themselves, but also those [functions] for which 

the numerical value does not vary under the substitutions. 


In Section 12.1 we saw that Lagrange’s concern was “only with the form” of ex- 
pressions and not “with their numerical quantity.” In this marginal note, we see that 
Galois is consciously going beyond Lagrange. We finally have a theory that applies 
to all polynomials, not just the universal one. 

The single asterisk * in the above quote will be discussed in the Historical Notes. 


B. Galois Resolvents. To understand the splitting field of a separable polynomial,
Galois used a variation of Lagrange’s notion of resolvent polynomial. Suppose that
$f \in F[x]$ can be written $f = a_0(x - \alpha_1) \cdots (x - \alpha_n)$ in a splitting field $L$, where
$\alpha_1, \ldots, \alpha_n$ are distinct. We also assume that $F$ is infinite. Given $t_1, \ldots, t_n \in F$,
consider the polynomial of degree $n!$ defined by

(12.21) $s(y) = \prod_{\sigma \in S_n} \bigl(y - (t_1\alpha_{\sigma(1)} + \cdots + t_n\alpha_{\sigma(n)})\bigr).$


The discussion following (5.4) in the proof of Proposition 5.2.1 in Section 5.2 shows 
that s(y) € F[y]. You should reread this argument, which uses symmetric functions 
and is similar to Galois’s. 

In this situation, Galois asserts that since $\alpha_1, \ldots, \alpha_n$ are distinct, one can find
$t_1, \ldots, t_n \in F$ so that the $n!$ elements

$$t_1\alpha_{\sigma(1)} + \cdots + t_n\alpha_{\sigma(n)}, \quad \sigma \in S_n,$$

are all distinct. In other words, $t_1, \ldots, t_n \in F$ may be chosen so that $s(y)$ is separable.
When this happens, we call $s(y)$ a Galois resolvent of $f$. Exercises 1 and 2 will
prove that such $t_1, \ldots, t_n$ exist. Galois uses the letter $V$ to denote $t_1\alpha_1 + \cdots + t_n\alpha_n$.
Following Lagrange, he refers to $V$ as a “function of the roots.”
Here is an example of a Galois resolvent. 


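Example 12.2.1 We will compute a Galois resolvent of $f = (x^2 - 2)(x^2 - 3) =
x^4 - 5x^2 + 6 \in \mathbb{Q}[x]$, which has roots $\sqrt{2}$, $-\sqrt{2}$, $\sqrt{3}$, and $-\sqrt{3}$. Let $(t_1, t_2, t_3, t_4) =
(0, 1, 2, 4)$. In the notation used by Galois,

$$V = 0 \cdot \sqrt{2} + 1 \cdot (-\sqrt{2}) + 2 \cdot \sqrt{3} + 4 \cdot (-\sqrt{3}) = -\sqrt{2} - 2\sqrt{3}.$$

Using Maple or Mathematica, one can compute that (12.21) gives

$$\begin{aligned}
s(y) = {} & 57817774440000 - 182187606350400\,y^2 + 185487963684016\,y^4 \\
& - 79514980673600\,y^6 + 17319964832940\,y^8 - 2128033120500\,y^{10} \\
& + 158533048725\,y^{12} - 7470312150\,y^{14} + 226512195\,y^{16} \\
& - 4390200\,y^{18} + 52395\,y^{20} - 350\,y^{22} + y^{24}.
\end{aligned}$$

A computer calculation also shows that $\gcd(s(y), s'(y)) = 1$, which implies that $s(y)$
is a Galois resolvent of $f$. Factoring $s(y)$ into irreducibles gives

$$\begin{aligned}
s(y) = {} & (100 - 28y^2 + y^4)(25 - 22y^2 + y^4)(361 - 70y^2 + y^4) \\
& (36 - 60y^2 + y^4)(841 - 70y^2 + y^4)(2116 - 100y^2 + y^4).
\end{aligned}$$

Each quartic factor is the minimal polynomial $y^4 - 2(2a^2 + 3b^2)y^2 + (2a^2 - 3b^2)^2$ of the four
values $\pm a\sqrt{2} \pm b\sqrt{3}$, where $(a, b)$ runs over $(1,2)$, $(2,1)$, $(2,3)$, $(3,2)$, $(4,1)$, $(1,4)$.
Hence the Galois resolvent is reducible in this case. ◁▷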
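
The computation in Example 12.2.1 can be reproduced with exact integer arithmetic instead of a computer algebra system. The sketch below rests on one assumption that is special to this example: every value $t_1\alpha_{\sigma(1)} + \cdots + t_4\alpha_{\sigma(4)}$ has the form $a\sqrt{2} + b\sqrt{3}$ with nonzero integers $a, b$, so it can be stored as the pair $(a, b)$ and its minimal polynomial over $\mathbb{Q}$ is $y^4 - 2(2a^2 + 3b^2)y^2 + (2a^2 - 3b^2)^2$. The names `resolvent_values`, `quartic`, and `poly_mul` are hypothetical helpers.

```python
from itertools import permutations

T = (0, 1, 2, 4)                               # the chosen (t_1, ..., t_4)
ROOTS = [(1, 0), (-1, 0), (0, 1), (0, -1)]     # sqrt2, -sqrt2, sqrt3, -sqrt3 as (a, b)

def resolvent_values(t=T, roots=ROOTS):
    """All n! values t_1*alpha_sigma(1) + ... + t_n*alpha_sigma(n), as pairs (a, b)."""
    values = []
    for sigma in permutations(range(len(t))):
        a = sum(ti * roots[s][0] for ti, s in zip(t, sigma))
        b = sum(ti * roots[s][1] for ti, s in zip(t, sigma))
        values.append((a, b))
    return values

def quartic(a, b):
    """Coefficients (constant term first) of the polynomial with roots +-a*sqrt2 +-b*sqrt3."""
    return ((2 * a * a - 3 * b * b) ** 2, 0, -2 * (2 * a * a + 3 * b * b), 0, 1)

def poly_mul(p, q):
    """Multiply two integer polynomials given as coefficient sequences."""
    out = [0] * (len(p) + len(q) - 1)
    for i, pi in enumerate(p):
        for j, qj in enumerate(q):
            out[i + j] += pi * qj
    return out

values = resolvent_values()
factors = sorted({quartic(a, b) for a, b in values})   # distinct quartic factors
s = [1]
for factor in factors:
    s = poly_mul(s, factor)                            # coefficients of s(y), degree 24
```

Since $\sqrt{2}$ and $\sqrt{3}$ are linearly independent over $\mathbb{Q}$, the 24 values are distinct exactly when the 24 pairs $(a, b)$ are distinct, which is how the sketch detects that $s(y)$ is separable. The list `s` can then be compared coefficient by coefficient with the displayed polynomial.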


As stated in [Galois, p. 49], the key property of V is the following: 


LEMMA III. If the function V is chosen as indicated in the preceding article,
then it has the property that every root of the given equation [our f] can be
expressed rationally as a function of V.


This lemma says the roots $\alpha_1, \ldots, \alpha_n$ lie in $F(V)$. It follows easily that

$$L = F(\alpha_1, \ldots, \alpha_n) = F(V),$$

since $V = t_1\alpha_1 + \cdots + t_n\alpha_n \in F(\alpha_1, \ldots, \alpha_n) = L$. Thus Lemma III implies that $V$ is a
primitive element of the splitting field of $f$ over $F$.

Galois’s proof of Lemma III is so terse that when Galois submitted his memoir 
to the French Academy in 1831, Poisson complained that the proof was insufficient 
but could be completed using Lagrange’s methods [Galois, p. 50]. A discussion of 
Galois’s proof can be found in [Edwards, §37] and [12]. In Exercise 3 you will use 
Lagrange’s methods to prove Lemma III. 

Let us compare Galois’s Lemma III with Example 12.1.8, where we considered

$$\beta = t_1x_1 + \cdots + t_nx_n$$

for the extension $K = F(\sigma_1, \ldots, \sigma_n) \subset L = F(x_1, \ldots, x_n)$. In Example 12.1.8, we used
Theorems 12.1.4 and 12.1.6 to show that $L = K(\beta)$ and that

$$\theta(y) = \prod_{\sigma \in S_n} \bigl(y - (t_1x_{\sigma(1)} + \cdots + t_nx_{\sigma(n)})\bigr)$$

is the resolvent of $\beta$. Thus $\theta(y)$ is irreducible over $K$ by Proposition 12.1.1. So
we have a primitive element $\beta$ and an irreducible polynomial $\theta(y)$ of degree $n!$. In
Galois’s situation, we have $V$ and $s(y)$, and although $V$ is a primitive element (by
Lemma III), Example 12.2.1 shows that $s(y)$ need not be irreducible. Thus, while
some of Lagrange’s results apply to arbitrary polynomials (such as the construction
of primitive elements), others do not (such as the irreducibility of resolvents).

As we have defined things, $V$ is a root of the Galois resolvent $s = s(y) \in F[y]$.
Since $s$ can be reducible over $F$, we let $h = h(y) \in F[y]$ be the minimal polynomial
of $V$ over $F$. Then $h$ is an irreducible factor of $s$. We will let $m$ denote the degree of
$h$. Note that $h$ is separable, since $s$ is.

Galois makes the crucial observation that the roots of $h$ interact with the roots of
the original polynomial $f$ as follows [Galois, pp. 49-51]:


LEMMA IV. Suppose that one forms the equation of V [our s], and that one takes
one of its irreducible factors, such that V is a root of an irreducible equation [our
h]. Let V, V′, V″, ... be the roots of this irreducible equation. If α = φ(V) is
one of the roots of the given equation [our f], then φ(V′) will also be a root of
the given equation.


(In the original, Galois wrote a = f(V). We have changed f to φ because we use f
for the given polynomial.)


Proof of Lemma IV: Since $L = F(V)$ contains the root $\alpha$ of $f$, we can write $\alpha = \varphi(V)$,
where $\varphi \in F[x]$. Also note that by normality, $h$ splits completely over $L$. This shows
that $V, V', V'', \ldots \in L$. In particular, $L$ contains the roots $V$ and $V'$ of the irreducible
polynomial $h$. By Proposition 5.1.8, we can find $\sigma \in \mathrm{Gal}(L/F)$ such that $\sigma(V) = V'$.
(This proposition played a crucial role in our development of Galois theory, especially
in the proof of Theorem 6.2.1.) Then Lemma IV follows immediately from

$$0 = \sigma(0) = \sigma(f(\alpha)) = \sigma(f(\varphi(V))) = f(\varphi(\sigma(V))) = f(\varphi(V')),$$

where the fourth equality uses $f, \varphi \in F[x]$ and $\sigma \in \mathrm{Gal}(L/F)$. ■


Galois’s argument is different from ours and doesn’t mention automorphisms
explicitly. But it should be clear that automorphisms and how they act on roots are
implicit in the statement of Lemma IV. Galois’s proof of Lemma IV is described in
[Edwards, pp. 51-52].


C. Galois’s Group. We next explore how Galois defined the Galois group. He 
considered only splitting fields of separable polynomials. We will show that in this 
situation, Galois’s definition is equivalent to the one given in Section 6.1. 

Consider the splitting field L of a separable polynomial f € F [x]. As above, we 
assume that F is infinite. For us, the Galois group Gal(L/F) consists of automor- 
phisms of L that are the identity on F. However, Proposition 6.3.1 shows that we can 
interpret Gal(L/F) in terms of permutations of the roots of f. Thus we can think 
of Gal(L/F) as consisting of all permutations of the roots that come from automor- 
phisms. In other words, we consider only those permutations that preserve the field 
operations. Here is an example of what this means. 
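Example 12.2.2 Let $L = \mathbb{Q}\bigl(\sqrt{2 + \sqrt{2}}\bigr)$ be the splitting field of $f = x^4 - 4x^2 + 2$
over $\mathbb{Q}$, and consider the permutation of the roots defined by

$$\begin{aligned}
\sqrt{2 + \sqrt{2}} &\mapsto \sqrt{2 - \sqrt{2}}, & \sqrt{2 - \sqrt{2}} &\mapsto \sqrt{2 + \sqrt{2}}, \\
-\sqrt{2 + \sqrt{2}} &\mapsto -\sqrt{2 + \sqrt{2}}, & -\sqrt{2 - \sqrt{2}} &\mapsto -\sqrt{2 - \sqrt{2}}.
\end{aligned}$$

This is not consistent with the field operations, since $\sqrt{2 + \sqrt{2}} \mapsto \sqrt{2 - \sqrt{2}}$ should
imply that $-\sqrt{2 + \sqrt{2}} \mapsto -\sqrt{2 - \sqrt{2}}$. Hence this permutation doesn’t come from
an automorphism in $\mathrm{Gal}(L/\mathbb{Q})$. ◁▷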


Galois did not use the notion of field automorphism. So how did he decide which
permutations to use? His approach is based on the primitive element $V$ and minimal
polynomial $h$ constructed above. We will use the following notation. Let

$$V, V', V'', \ldots, V^{(m-1)}$$

denote the roots of $h$. Furthermore, since $L = F(V)$ by Lemma III, the roots $\alpha_1, \ldots, \alpha_n$
of $f$ can be written

(12.22) $\varphi(V),\ \varphi_1(V),\ \varphi_2(V),\ \ldots,\ \varphi_{n-1}(V),$

where $\varphi, \varphi_1, \varphi_2, \ldots, \varphi_{n-1}$ have coefficients in $F$. Then Galois describes his group
as follows [Galois, p. 53]:


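No matter what the given equation [our f] is, one can find a rational function V
of the roots such that all of the roots are rational functions of V. Given this V,
let us consider the irreducible equation of which V is a root (lemmas III and IV)
[our h]. Let $V, V', V'', \ldots, V^{(m-1)}$ be the roots of this equation.

Let $\varphi V, \varphi_1 V, \varphi_2 V, \ldots, \varphi_{n-1}V$ be roots of the proposed equation.

Write down the following m permutations of the roots:

$$\begin{array}{llllll}
(V), & \varphi V, & \varphi_1 V, & \varphi_2 V, & \ldots, & \varphi_{n-1}V, \\
(V'), & \varphi V', & \varphi_1 V', & \varphi_2 V', & \ldots, & \varphi_{n-1}V', \\
(V''), & \varphi V'', & \varphi_1 V'', & \varphi_2 V'', & \ldots, & \varphi_{n-1}V'', \\
\ldots, & \ldots, & \ldots, & \ldots, & \ldots, & \ldots, \\
(V^{(m-1)}), & \varphi V^{(m-1)}, & \varphi_1 V^{(m-1)}, & \varphi_2 V^{(m-1)}, & \ldots, & \varphi_{n-1}V^{(m-1)}.
\end{array}$$

I say that this group of permutations has the desired property.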


(In the original, n and m are interchanged. We have switched them in order to be
consistent with the notation used here.) In this table, the first entry of a row is a label
for the row, and the remaining $n$ entries of the row are roots of $f$ by Lemma IV.

One complication is that for Galois, the word “permutation” has a different meaning
than it has for us. We will discuss this in the Historical Notes below. For
now, we will understand the above quote as saying that Galois’s “group” consists of
the $m$ permutations obtained by mapping $\varphi(V), \varphi_1(V), \varphi_2(V), \ldots, \varphi_{n-1}(V)$ to the
$m$ rows displayed in the quote. These permutations are related to the Galois group
$\mathrm{Gal}(L/F)$ as follows.
Theorem 12.2.3 Let $L$ be the splitting field of the separable polynomial $f$ in $F[x]$
and let the roots of $f$ be denoted as in (12.22). Proposition 6.3.1 gives a one-to-one
group homomorphism $\mathrm{Gal}(L/F) \to S_n$. Then the image of this map consists of the $m$
permutations described by Galois.


Proof: As above, $V$ is a primitive element of $F \subset L$, and $h$ is the minimal polynomial
of $V$ over $F$. The $m$ roots of $h$ will be denoted $V = V^{(0)}, V^{(1)} = V', V^{(2)} = V'', \ldots,
V^{(m-1)}$. Note also that $h$ is separable. Then the proof of Theorem 6.2.1 implies that

$$\mathrm{Gal}(L/F) = \{\sigma_1, \ldots, \sigma_m\},$$

where $\sigma_i$ is the automorphism of $L$ that takes the primitive element $V$ to the root
$V^{(i-1)}$ of $h$. As in the proof of Lemma IV, it follows that

(12.23) $\sigma_i(\psi(V)) = \psi(\sigma_i(V)) = \psi(V^{(i-1)})$

for any polynomial $\psi$ with coefficients in $F$.
In the homomorphism $\mathrm{Gal}(L/F) \to S_n$ from Proposition 6.3.1, $\sigma_i$ maps to the
permutation that takes $\varphi(V), \varphi_1(V), \varphi_2(V), \ldots, \varphi_{n-1}(V)$ to

$$\sigma_i(\varphi(V)),\ \sigma_i(\varphi_1(V)),\ \sigma_i(\varphi_2(V)),\ \ldots,\ \sigma_i(\varphi_{n-1}(V)).$$

Using (12.23), this can be rewritten as

(12.24) $\varphi(V^{(i-1)}),\ \varphi_1(V^{(i-1)}),\ \varphi_2(V^{(i-1)}),\ \ldots,\ \varphi_{n-1}(V^{(i-1)}).$

One easily sees that (12.24) is the $i$th row displayed in the above quote since $V, V',
V'', \ldots, V^{(m-1)}$ are now $V^{(0)}, V^{(1)}, V^{(2)}, \ldots, V^{(m-1)}$. Hence the images of the $\sigma_i$
are the $m$ permutations described by Galois. ■


This theorem shows that for the splitting field of a separable polynomial, the
definition used by Galois is equivalent to Definition 6.1.1. The Historical Notes to
Section 6.1 give a brief description of how we got from Galois’s group to the modern
Galois group $\mathrm{Gal}(L/F)$. A more detailed explanation appears in [11].
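
Galois's table can be written down explicitly for the polynomial $f = x^4 - 4x^2 + 2$ of Example 12.2.2. The following sketch makes several assumptions that are not taken from the text: it uses $V = \alpha = \sqrt{2 + \sqrt{2}}$ as the primitive element (so that $h = f$), it uses the polynomials $\varphi = x$, $\varphi_1 = x^3 - 3x$, $\varphi_2 = -x$, $\varphi_3 = -(x^3 - 3x)$, which were worked out by hand so that the roots are $\varphi(V), \varphi_1(V), \varphi_2(V), \varphi_3(V)$, and it identifies roots by floating-point proximity rather than exact arithmetic.

```python
import math

alpha = math.sqrt(2 + math.sqrt(2))
beta = math.sqrt(2 - math.sqrt(2))
roots = [alpha, beta, -alpha, -beta]     # the roots of f = x^4 - 4x^2 + 2

def phi(k, v):
    """Evaluate the k-th polynomial at v; phi(k, alpha) equals roots[k]."""
    w = v ** 3 - 3 * v                   # alpha^3 - 3*alpha = beta
    return [v, w, -v, -w][k]

def nearest_root(x):
    """Index of the root of f closest to the floating-point number x."""
    return min(range(4), key=lambda k: abs(roots[k] - x))

def galois_rows():
    """One row of Galois's table for each root V, V', V'', V''' of h = f."""
    return [tuple(nearest_root(phi(k, v)) for k in range(4)) for v in roots]

rows = galois_rows()
swap = (1, 0, 2, 3)   # the permutation of Example 12.2.2: swaps the first two roots
```

Each row records, by index, where the roots $\varphi(V), \ldots, \varphi_3(V)$ are sent when $V$ is replaced by another root of $h$. The four rows form a cyclic group of order 4, and the permutation of Example 12.2.2 (stored here as `swap`) is not among them.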


D. Natural and Accessory Irrationalities. Before explaining Galois’s strategy
for solving equations, we need to discuss some classical terminology. Let $F \subset L$ be
the splitting field of a separable polynomial $f \in F[x]$. Then adjoin a quantity $\beta$ to $F$,
where $\beta$ is a root of an auxiliary equation that we assume to be known. For example,
$\beta$ could be a radical or a root of a resolvent equation. If $\beta \notin F$, then we call $\beta$ a
natural irrationality when $\beta \in L$ and an accessory irrationality when $\beta \notin L$.


Example 12.2.4 Let $f \in F[x]$ be solvable by radicals with splitting field $F \subset L$.

• If $F \subset L$ is radical in the sense of Section 8.2, then we can obtain the roots of $f$
by adjoining natural irrationalities.

• If $F \subset L$ is solvable but not radical, then at least one of the radicals adjoined must
be an accessory irrationality. ◁▷
The quantity $\beta \notin F$ in the above discussion gives an extension $F \subset K = F(\beta)$.
Then it is easy to see that

$$\begin{aligned}
\beta \text{ is a natural irrationality} &\iff K \subset L, \\
\beta \text{ is an accessory irrationality} &\iff K \not\subset L.
\end{aligned}$$

When $K \not\subset L$, we can assume that $K$ and $L$ are both contained in some larger field
(see Exercise 4). Then we get a diagram

             KL
            /  \
(12.25)    K    L
            \  /
             F

where $KL$ is the compositum of $K$ and $L$ as in Definition 8.2.5. The relation between
$\mathrm{Gal}(KL/K)$ and $\mathrm{Gal}(L/F)$ is described by the following result.


Theorem 12.2.5 Suppose that we have a diagram (12.25) where $F \subset L$ is a Galois
extension and $F \subset K$ is finite. Then $K \subset KL$ is a Galois extension and the restriction
map $\sigma \mapsto \sigma|_L$ defines an isomorphism

$$\mathrm{Gal}(KL/K) \simeq \mathrm{Gal}(L/K \cap L) \subset \mathrm{Gal}(L/F).$$


Proof: In Exercise 5 you will show that $K \subset KL$ is Galois whenever $F \subset L$ is. By
the results of Chapter 7, $\mathrm{Gal}(KL/L) \subset \mathrm{Gal}(KL/F)$ is normal since $F \subset L$ is Galois,
and thus $\sigma L = L$ for all $\sigma \in \mathrm{Gal}(KL/F)$. Thus $\sigma$ gives a map

$$\sigma|_L : L \to L.$$

Since $\sigma^{-1}|_L$ is the inverse of $\sigma|_L$ (see Exercise 6), $\sigma|_L$ is an automorphism of $L$.
Furthermore, $\sigma$ is the identity on $F$, which implies that the same is true for $\sigma|_L$.
When restricted to $\mathrm{Gal}(KL/K) \subset \mathrm{Gal}(KL/F)$, this gives a map

(12.26) $\mathrm{Gal}(KL/K) \longrightarrow \mathrm{Gal}(L/F).$

You will show that this is a group homomorphism in Exercise 6.

To see that (12.26) is one-to-one, suppose that $\sigma \in \mathrm{Gal}(KL/K)$ and $\sigma|_L$ is the
identity on $L$. Then $\sigma$ is the identity on both $K$ and $L$, which easily implies that $\sigma$ is
the identity on $KL$ (see Exercise 6). Thus (12.26) is one-to-one.

Finally, we need to show that the image is $\mathrm{Gal}(L/K \cap L)$. For this purpose, let
$H \subset \mathrm{Gal}(L/F)$ be the image of (12.26). By the Galois correspondence, it suffices
to show that $L_H = K \cap L$. First suppose $\alpha \in L_H$. Then $\alpha$ is fixed by all $\sigma|_L \in H$,
which means that $\alpha$ is fixed by all $\sigma \in \mathrm{Gal}(KL/K)$ and hence is in the fixed field
of $\mathrm{Gal}(KL/K)$. Thus $\alpha \in K$ since $K \subset KL$ is Galois, and $\alpha \in K \cap L$ follows from
$\alpha \in L_H \subset L$. This proves that $L_H \subset K \cap L$.

For the other inclusion, let $\alpha \in K \cap L$. Then in particular, $\alpha$ is in $K$ and hence is
fixed by all $\sigma \in \mathrm{Gal}(KL/K)$. Thus $\sigma|_L(\alpha) = \alpha$ for all $\sigma|_L \in H$, which implies $\alpha \in L_H$,
as claimed. This completes the proof. ■


Theorem 12.2.5 is sometimes called the Theorem on Natural Irrationalities. To
see why, suppose that $K \not\subset L$, i.e., $K$ is obtained from $F$ by adjoining accessory
irrationalities. Then the isomorphism

$$\mathrm{Gal}(KL/K) \simeq \mathrm{Gal}(L/K \cap L)$$

of Theorem 12.2.5 means that $K \subset KL$ and $K \cap L \subset L$ have the same Galois group. But
$K \cap L$ lies inside $L$ and hence is obtained from $F$ by adjoining natural irrationalities.
Thus, from the point of view of Galois theory, Theorem 12.2.5 implies that we don’t
need accessory irrationalities.


E. Galois’s Strategy. In Section 12.1 we saw that Lagrange formulated his 
strategy for solving equations in terms of resolvents. However, there are groups 
lurking in the background. For instance, our discussion of the quartic used 


$$y_1 = x_1x_2 + x_3x_4 \quad\text{and}\quad t_1 = x_1 + x_2 - x_3 - x_4,$$

whose isotropy subgroups are

$$H(y_1) = \langle (1\,2), (1\,3\,2\,4) \rangle \supset H(t_1) = \langle (1\,2), (3\,4) \rangle.$$


So the idea of reducing to smaller groups is implicit in what Lagrange was doing. 
Getting smaller groups is the main goal of Galois’s strategy. In the Historical Notes 
to Section 8.3, we gave the following quote where Galois discusses his approach to 
solvability by radicals [Galois, pp. 57-59]: 
I first observe that to solve an equation, it is necessary to reduce its group 
until it contains only a single permutation ... 


Given this, we will try to find the condition satisfied by the group of an 
equation for which it is possible to reduce the group [to a single permutation] by 
adjunction of radical quantities . . . 


In the first sentence, Galois states the goal of reducing the Galois group to the identity, 
and in the second, he says that in the case of solvability by radicals, the goal is to 
reduce the Galois group by adjoining radicals. 
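
The isotropy subgroups of $y_1 = x_1x_2 + x_3x_4$ and $t_1 = x_1 + x_2 - x_3 - x_4$ mentioned above can be found by brute force over $S_4$. The sketch below tests invariance by evaluating each polynomial at a few integer sample points, which is a heuristic assumption (an exact test would compare the polynomials themselves); the sample points and the function names are hypothetical, and permutations act by $x_i \mapsto x_{\sigma(i)}$ on 0-indexed variables.

```python
from itertools import permutations

def y1(x):
    return x[0] * x[1] + x[2] * x[3]

def t1(x):
    return x[0] + x[1] - x[2] - x[3]

SAMPLES = ((1, 10, 100, 1000), (2, 3, 5, 7), (11, 13, 17, 19))

def isotropy(phi, n=4, samples=SAMPLES):
    """Permutations sigma with phi(x_sigma(1), ..., x_sigma(n)) = phi(x_1, ..., x_n)
    at every sample point."""
    group = []
    for sigma in permutations(range(n)):
        if all(phi(tuple(x[sigma[i]] for i in range(n))) == phi(x) for x in samples):
            group.append(sigma)
    return group

H_y1 = isotropy(y1)   # expected to be <(1 2), (1 3 2 4)>
H_t1 = isotropy(t1)   # expected to be <(1 2), (3 4)>
```

For these sample points the search returns a group of order 8 for $y_1$ and a group of order 4 for $t_1$, with the second contained in the first, in agreement with the displayed inclusion.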

In its most general form, Galois’s strategy is to successively adjoin known quantities
(radicals or roots of resolvents) in order to reduce the Galois group to the identity.
This adjunction process gives an extension $F \subset K$ that we regard as known. The
splitting field of $f$ over $K$ is easily seen to be $K \subset KL$, which is one of the extensions
in the diagram (12.25). By Theorem 12.2.5, we have

(12.27) $\mathrm{Gal}(KL/K) \simeq$ a subgroup of $\mathrm{Gal}(L/F)$.

Thus going from $F$ to $K$ gives a subgroup of the original Galois group. Furthermore,
if the new Galois group is the identity, then $KL = K$, which implies that $L \subset K$. Since
$K$ is known, it follows that the roots of $f$ are also known.




Here is an example of how this works. 


Example 12.2.6 One easily checks that $f = x^3 + 9x - 2 \in \mathbb{Q}[x]$ is irreducible over $\mathbb{Q}$
with real root given by

$$\alpha_1 = \sqrt[3]{1 + 2\sqrt{7}} + \sqrt[3]{1 - 2\sqrt{7}}.$$

The other roots of $f$ are a complex conjugate pair $\alpha_2 = \overline{\alpha_3}$, since $\Delta(f) = -3024$ is
negative. If $L$ is the splitting field of $f$ over $\mathbb{Q}$, then Proposition 7.4.2 implies that
$\mathrm{Gal}(L/\mathbb{Q}) \simeq S_3$, since $\sqrt{\Delta(f)} \notin \mathbb{Q}$.

To make the Galois group smaller, we adjoin $\beta = \sqrt[3]{1 + 2\sqrt{7}}$ to $\mathbb{Q}$, which gives
$K = \mathbb{Q}(\beta)$. In Exercise 7 you will show that $\sqrt[3]{1 - 2\sqrt{7}}$ and hence $\alpha_1$ lie in $K$. This
means that $f$ factors as $(x - \alpha_1)g$, where $g \in K[x]$ has roots $\alpha_2, \alpha_3$. Thus $KL$ is
obtained from $K$ by adjoining $\alpha_2, \alpha_3$. Since $K \subset \mathbb{R}$ and $\alpha_2, \alpha_3$ are not real, it follows
that $[KL : K] = 2$. This shows that $\mathrm{Gal}(KL/K) \simeq \mathbb{Z}/2\mathbb{Z}$.

Furthermore, if we think of $KL$ as the splitting field of $f$ over $K$, then we still have
the map $\mathrm{Gal}(KL/K) \to S_3$. Given how we’ve labeled the roots, the image of this map
is clearly $\langle (2\,3) \rangle \subset S_3$. So adjoining $\beta$ reduces the group of permutations from $S_3$ to
the smaller group $\langle (2\,3) \rangle$.

In Exercise 7 you will show that if we adjoin $\omega = e^{2\pi i/3}$ to $K$, then $K' = K(\omega)$
contains all roots of $f$, so that $K'L = K'$. Hence the Galois group has been reduced
to $\mathrm{Gal}(K'L/K') = \{e\}$, which completes Galois’s strategy. ◁▷
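
The numerical claims in Example 12.2.6 are easy to check in floating point. The sketch below assumes the standard discriminant formula $\Delta = -4p^3 - 27q^2$ for a cubic $x^3 + px + q$ and the usual Cardano form of the complex roots, $\omega\beta + \omega^2\gamma$ and $\omega^2\beta + \omega\gamma$ with $\gamma = \sqrt[3]{1 - 2\sqrt{7}}$; these formulas and the helper names are not taken from the example itself.

```python
import cmath
import math

def f(x):
    return x ** 3 + 9 * x - 2

def real_cbrt(r):
    """Real cube root of a real number (handles negative input)."""
    return math.copysign(abs(r) ** (1.0 / 3.0), r)

beta = real_cbrt(1 + 2 * math.sqrt(7))
gamma = real_cbrt(1 - 2 * math.sqrt(7))    # numerically equal to -3/beta
omega = cmath.exp(2j * cmath.pi / 3)       # primitive cube root of unity

alpha1 = beta + gamma                      # the real root
alpha2 = omega * beta + omega ** 2 * gamma
alpha3 = omega ** 2 * beta + omega * gamma

p, q = 9, -2
disc = -4 * p ** 3 - 27 * q ** 2           # discriminant of x^3 + p*x + q
```

Here `disc` equals $-3024$, the three values `alpha1`, `alpha2`, `alpha3` are roots of $f$ up to rounding error, and `gamma` agrees with `-3/beta`. The last observation is consistent with the claim that $\sqrt[3]{1 - 2\sqrt{7}}$ lies in $K = \mathbb{Q}(\beta)$, and the formulas for `alpha2` and `alpha3` show why adjoining $\omega$ yields all of the roots.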


To fully understand Galois’s strategy, we need to think in terms of permutations.
If our separable polynomial $f$ has degree $n$, then the action of the Galois group on
the roots gives a map

(12.28) $\mathrm{Gal}(L/F) \longrightarrow S_n$

whose image is Galois’s group by Theorem 12.2.3. Now let $F \subset K$ be a finite
extension. Since $K \subset KL$ is the splitting field of $f$ over $K$, we get a similar map

(12.29) $\mathrm{Gal}(KL/K) \longrightarrow S_n.$

In Exercise 8 you will show that these maps are compatible with the isomorphism
given in (12.27). Hence, when we regard $\mathrm{Gal}(KL/K)$ and $\mathrm{Gal}(L/F)$ as subgroups
of $S_n$, the former is contained in the latter.

This makes Galois’s strategy easy to understand. He works with a fixed separable
polynomial $f$ of degree $n$. For him, the group of $f$ lies in $S_n$, but the field he works
over keeps changing. Furthermore, each time he enlarges the field by adjoining
something known (a radical or a root of a resolvent), he passes from the group to a
subgroup (which may be the whole group). This leads to extensions

$$F \subset K_1 \subset K_2 \subset \cdots$$

where each $K_i$ is regarded as consisting of things that are known. If at some point,
say for $K_m$, the group reduces to the identity, then $\mathrm{Gal}(K_mL/K_m) = \{e\}$, which as
noted above implies that $L \subset K_m$. This allows us to express the roots of $f$ in terms of
known quantities.


Historical Notes 


When reading Galois, one must keep in mind the distinction between arrangements 
of roots and permutations of roots. If you look back at the quotation giving Galois’s 
definition of his group, you will see that he lists m arrangements of the roots. The 
corresponding permutations come from mapping the first arrangement to the others. 
To complicate matters, Galois uses different terminology from us: 


Us Galois 


Arrangement | Permutation 
Permutation | Substitution 


So when Galois says “group of permutations,” he really means “group of arrange- 
ments.” But later in the memoir, we find the following [Galois, pp. 53-55]: 


It is evident that in the group of permutations considered here, the order of the 
letters is not of importance, but rather only the SUBSTITUTIONS of the letters by 
which one passes from one permutation to another. 


By Theorem 12.2.3, these substitutions form a subgroup of $S_n$ isomorphic to the
Galois group.

Galois knew the difference between arrangements and substitutions, and was 
aware that the latter formed a group in the modern sense [Galois, p. 47]: 


... if one has substitutions S and T within such a group, one is sure to have the
substitution ST.


From the modern point of view, substitutions are more important. But this was not 
clear to Galois, especially given the vivid visual image provided by groups of ar- 
rangements. This is evident from Galois’s definition of his group, and other examples 
can be found in [12]. Galois’s memoir is written in terms of arrangements, although 
changes made shortly before his death in 1832 indicate that Galois was thinking 
about switching to substitutions. For example, we quoted Galois’s Proposition I at 
the beginning of the section. This quotation includes an asterisk * that refers to a 
marginal note where Galois says “Put everywhere in place of the word permutation 
the word substitution” [Galois, p. 50]. But then Galois crosses this out! 

It took a while for the mathematical community to understand Galois’s ideas. 
In 1866 the third edition of Serret’s Cours d’algébre supérieure included a partial 
account of Galois theory. As quoted in [11, p. 110], Serret comments that “Galois 
used the notion of groups of permutations [our arrangements] ..., but it seems better 
for us to keep to substitutions.” This quote also shows that “substitution” was the 
common name for elements of $S_n$ in the nineteenth century. Another example of this
is Jordan’s 1870 text Traité des substitutions et des équations algébriques [Jordan1], 
which gave the first complete account of Galois theory. 

Our discussion of Galois’s strategy did not state his version of Theorem 12.2.5 (the 
Theorem on Natural Irrationalities). The reason is that one needs to understand the
distinction between arrangements and permutations before reading Galois’s version,
which goes as follows [Galois, p. 55]: 


PROPOSITION II 


THEOREM. If one adjoins to a given equation [our f] the root r of an
auxiliary irreducible equation *,
1° one of two things will occur: either the group of the equation will not change;
or it will be partitioned into p groups each belonging to the original equation
when one adjoins each of the roots of the auxiliary equation;
2° these groups have the remarkable property, that one passes from one to another
by applying to all of the permutations of the first the same substitution of the
letters.


The asterisk * indicates that r was a root of an auxiliary equation “of prime degree 
p” in an earlier version of Proposition II [Galois, p. 54]. This is the “p” that appears 
in 1°. In Exercise 9 you will use the Galois correspondence and Theorem 12.2.5 
to show that if [K : F] = p, then Gal(KL/K) is isomorphic either to Gal(L/F) or to 
a subgroup of index p in Gal(L/F). The latter corresponds to “partitioned into p 
groups” in the above quotation. 

It appears that Galois first proved Proposition II in the prime-degree case. The 
night before his fatal duel, he realized that his proof applied in greater generality. 
Writing in haste, he changed part but not all of the statement of Proposition II. He 
also knew that his proof was incomplete; this is where he writes “Je n’ai pas le
temps” (“I don’t have time”) [Galois, p. 54].

This explains Proposition II up to the appearance of p. But what about the 
remainder of 1°? The idea is that instead of adjoining one root r of the auxiliary 
equation, one could adjoin a different root r’ of the same equation. This gives a 
different extension $F \subset K' = F(r')$. Then going from $F$ to $K'$ will reduce the group,
but possibly in a different way. In modern terms, $F \subset K$ and $F \subset K'$ are conjugate
extensions. By Theorem 12.2.5, $\mathrm{Gal}(KL/K)$ and $\mathrm{Gal}(K'L/K')$ are isomorphic to
subgroups of $\mathrm{Gal}(L/F)$. Then Galois’s observation in 1° is that these subgroups are
conjugate in $\mathrm{Gal}(L/F)$. You will prove this in Exercise 10.

The precise meaning of 2° of Proposition II will be explored in Exercises 11 
and 12. Galois’s proof of Proposition II can be found in [5] and [12]. 

This concludes our discussion of Galois. However, to fully appreciate what 
Galois did, the reader should keep in mind Galois’s other contributions to Galois 
theory, many of which were discussed earlier in the book: 

• Extension fields (Historical Notes to Section 4.1).

• The Galois correspondence (Historical Notes to Section 7.1).

• Normal subgroups (Historical Notes to Section 7.2).

• Solvable groups and solvability by radicals (Historical Notes to Section 8.3).

• Finite fields (Historical Notes to Section 11.1).

This is an impressive list for someone who died at age 20. There is also Galois’s
amazing work on irreducible polynomials of degree $p$ and $p^2$, where $p$ is prime. This
will be described in Chapter 14.
For a fuller account of Galois’s mathematical work, the reader should consult [5],
[11, pp. 80-84], [12], [Edwards], or [Tignol, Ch. 14]. The biography of Galois [13] 
describes his short but intense life. 


Exercises for Section 12.2 


Exercise 1. Let $F$ be an infinite field and let $V$ be a finite-dimensional vector space over $F$.
The goal of this exercise is to prove that $V$ cannot be the union of a finite number of proper
subspaces. This will be used in Exercise 2 to prove the existence of Galois resolvents.
Let $W_1, \ldots, W_m$ be proper subspaces of $V$ such that $V = W_1 \cup \cdots \cup W_m$, where $m > 1$ is the
smallest positive integer for which this is true. We derive a contradiction as follows.
(a) Explain why there is $v \in W_1 \setminus (W_2 \cup \cdots \cup W_m)$.
(b) There is $w \in V \setminus W_1$, since $W_1$ is a proper subspace. Using $v$ from part (a), we have
$\lambda v + w \in V = W_1 \cup \cdots \cup W_m$ for all $\lambda \in F$. Explain why this implies that there are $\lambda_1 \ne \lambda_2$
in $F$ such that $\lambda_1 v + w, \lambda_2 v + w \in W_i$ for some $i$.
(c) Now derive the desired contradiction.


Exercise 2. Suppose that we have an extension $F \subset L$, where $F$ is infinite. The goal of this
exercise is to show that if $\alpha_1,\dots,\alpha_n \in L$ are distinct, then $t_1,\dots,t_n \in F$ can be chosen so that
the polynomial $s(y)$ defined in (12.21) has distinct roots. Given $\sigma \ne \tau$ in $S_n$, let

$$W_{\sigma,\tau} = \Bigl\{(t_1,\dots,t_n) \in F^n \Bigm| \sum_{i=1}^{n} \bigl(\alpha_{\sigma(i)} - \alpha_{\tau(i)}\bigr)\, t_i = 0 \text{ in } L\Bigr\}.$$

(a) Prove that $W_{\sigma,\tau}$ is a subspace of $F^n$ and that $W_{\sigma,\tau} \ne F^n$.
(b) Show that part (a) and Exercise 1 imply that there are $t_1,\dots,t_n \in F$ such that the polynomial
$s(y)$ from (12.21) has distinct roots.


Exercise 3. This exercise will prove Galois’s Lemma III using the methods of Lagrange.
Let $V = t_1\alpha_1 + \cdots + t_n\alpha_n$, where $t_1,\dots,t_n$ are chosen so that the Galois resolvent $s(y)$ from
(12.21) is separable. Also let $V_\sigma = t_1\alpha_{\sigma(1)} + \cdots + t_n\alpha_{\sigma(n)}$ for $\sigma \in S_n$. Prove that each $\alpha_i$ can
be written as a rational function in $V$ with coefficients in $F$ by adapting the second proof of
Theorem 12.1.6.


Exercise 4. In the discussion preceding (12.25), we have extensions $F \subset L$, which is a splitting
field of $f \in F[x]$, and $F \subset K = F(\xi)$, where $\xi$ is a root of an irreducible polynomial in $F[x]$.
Given the many ways in which extension fields can be constructed, these extensions might
not have much to do with each other. Prove that there is an extension $F \subset M$ that contains
subfields $F \subset L_1 \subset M$ and $F \subset K_1 \subset M$ such that $L_1, K_1$ are isomorphic to $L, K$, respectively,
where the isomorphisms are the identity on $F$. Thus, by replacing $L, K$ with the isomorphic
fields $L_1, K_1$, we can assume that $L, K$ lie in a larger field, as claimed in the text.


Exercise 5. Suppose that $F \subset L$ is the splitting field of a separable polynomial $f \in F[x]$. Also
suppose that we have another finite extension $F \subset K$ such that the compositum $KL$ is defined.
Prove that $K \subset KL$ is the splitting field of $f$ over $K$.


Exercise 6. This exercise will complete the proof of Theorem 12.2.5. Given $\sigma \in \mathrm{Gal}(KL/K)$,
we showed in the text that $\sigma|_L$ maps $L$ to $L$.

(a) Show that $(\sigma\tau)|_L = \sigma|_L\,\tau|_L$.

(b) Use part (a) to show that $\sigma^{-1}|_L$ is the inverse function of $\sigma|_L$.

(c) Use part (a) to show that (12.26) is a group homomorphism.

(d) Let $\sigma$ be an automorphism of $KL$ that is the identity on both $K$ and $L$. Prove that $\sigma$ is the
identity on $KL$.


Exercise 7. This exercise is concerned with the details of Example 12.2.6. As in the example,
let $L$ be the splitting field of $f = x^3 + 9x - 2$ over $\mathbb{Q}$ and set $K = \mathbb{Q}(\beta)$, where $\beta = \sqrt[3]{1 + 2\sqrt{7}}$.

(a) Show that $\sqrt[3]{1 - 2\sqrt{7}} \in K$.

(b) Show that $K' = K(\omega)$, $\omega = e^{2\pi i/3}$, contains all roots of $f$.


Exercise 8. In Theorem 12.2.5, we have the map (12.26) defined by $\sigma \mapsto \sigma|_L$. However, if
$F \subset L$ is the splitting field of a separable polynomial $f \in F[x]$ of degree $n$, then we also have
maps (12.28) and (12.29). Prove that these maps are compatible, i.e., that $\sigma \in \mathrm{Gal}(KL/K)$ and
$\sigma|_L \in \mathrm{Gal}(L/F)$ map to the same element of $S_n$ under (12.28) and (12.29).


Exercise 9. In the situation of Theorem 12.2.5, suppose that $F \subset K$ is an extension of prime
degree $p$. Prove that $\mathrm{Gal}(KL/K)$ is isomorphic to either $\mathrm{Gal}(L/F)$ or a subgroup of index $p$
in $\mathrm{Gal}(L/F)$.


Exercise 10. Suppose that we have a diagram (12.25) as in Theorem 12.2.5. Also assume
that $K = F(\beta)$, and let $K' = F(\beta')$, where $\beta'$ and $\beta$ have the same minimal polynomial over
$F$. You will show that $\mathrm{Gal}(KL/K)$ and $\mathrm{Gal}(K'L/K')$ give conjugate subgroups of $\mathrm{Gal}(L/F)$.
This is the modern version of what Galois says in 1° of Proposition II.
(a) Let $F \subset M'$ be the Galois closure of the extension $F \subset M$ constructed in Exercise 4.
Explain why we can regard $L$, $K$, and $K'$ as subfields of $M'$.
(b) Explain why we can find $\tau \in \mathrm{Gal}(M'/F)$ such that $\tau(K) = K'$.
(c) Show that $\tau|_L \in \mathrm{Gal}(L/F)$ maps $K \cap L$ to $K' \cap L$. Thus $K \cap L$ and $K' \cap L$ are conjugate
subfields of $L$.
(d) Use Lemma 7.2.4 to show that in Theorem 12.2.5, $\mathrm{Gal}(KL/K)$ and $\mathrm{Gal}(K'L/K')$ map to
conjugate subgroups of $\mathrm{Gal}(L/F)$.


Exercise 11. Let $A$ denote the set of arrangements described by Galois. This is Galois’s
“group.” For simplicity, we write the first arrangement on Galois’s list as $\alpha_1 \cdots \alpha_n$. Then let
$G$ be the set of permutations that take the first element of $A$ to the others. Theorem 12.2.3
implies that $G$ is a subgroup of $S_n$ isomorphic to $\mathrm{Gal}(L/F)$.

We also have the action of $S_n$ on the set of all $n!$ arrangements of roots by

$$\sigma \cdot \alpha_{i_1} \cdots \alpha_{i_n} = \alpha_{\sigma(i_1)} \cdots \alpha_{\sigma(i_n)}.$$

This induces an action of $G$ on the set of arrangements.
(a) Explain why $A$ is the orbit of $\alpha_1 \cdots \alpha_n$ under the $G$ action.
(b) Show that the map $G \to A$ defined by $\sigma \mapsto \sigma \cdot \alpha_1 \cdots \alpha_n$ is one-to-one and onto.


Exercise 12. In the situation of Theorem 12.2.5, let $G \subset S_n$ correspond to $\mathrm{Gal}(L/F)$, and
$H \subset S_n$ correspond to $\mathrm{Gal}(KL/K)$. By Exercise 8, we know that $H \subset G$. Also let $A$ be
the set of arrangements studied in Exercise 11. Then a left coset $\sigma H \subset G$ gives a subset
$\sigma H \cdot \alpha_1 \cdots \alpha_n \subset A$, and since the map $\sigma \mapsto \sigma \cdot \alpha_1 \cdots \alpha_n$ is one-to-one and onto, the sets
$\sigma H \cdot \alpha_1 \cdots \alpha_n$ partition $A$ into disjoint subsets. We claim that these are the “groups” that appear
in 1° and 2° of Galois’s Proposition II.

(a) Given any two such “groups” $\sigma H \cdot \alpha_1 \cdots \alpha_n$ and $\tau H \cdot \alpha_1 \cdots \alpha_n$, prove that there is $\gamma \in G$
such that (as Galois says in 2°) one passes from one to the other by applying $\gamma$ to all
arrangements in the first.

(b) So far, it seems like Galois is describing cosets. However, as pointed out in [12], Galois
thought of these “groups” differently. This is seen by explaining how they relate to 1° of
Galois’s proposition. Let $M'$ be the field used in Exercise 10, and let $\tau \in \mathrm{Gal}(M'/F)$.
Then $K' = \tau(K)$ is a conjugate of $K$. Let $\sigma \in G$ be the permutation corresponding to
$\tau|_L \in \mathrm{Gal}(L/F)$. Show that $\sigma H \sigma^{-1}$ is the subgroup of $S_n$ corresponding to $\mathrm{Gal}(K'L/K')$.

(c) Using the setup of part (b), consider the “group” $\sigma H \cdot \alpha_1 \cdots \alpha_n \subset A$. Prove that $\sigma H \sigma^{-1} \subset S_n$
is the set of all permutations of $S_n$ that map the first element of this “group,” namely
$\sigma \cdot \alpha_1 \cdots \alpha_n$, to another element of the “group.” (Remember that this is the process for
turning a “group” of arrangements into a subgroup of $S_n$.)

Combining parts (b) and (c), we see that what Galois says in 1° of Proposition II is fully
consistent with what we did in Exercise 10.


Exercise 13. This exercise will show that not all choices of the $t_i$ in (12.21) give Galois
resolvents. As in Example 12.2.1, $f = (x^2 - 2)(x^2 - 3)$ has roots $\sqrt{2}$, $-\sqrt{2}$, $\sqrt{3}$, and $-\sqrt{3}$.
This time we will use $(t_1, t_2, t_3, t_4) = (0, 1, 2, 3)$. Show that (12.21) gives the polynomial

$$\begin{aligned}
s(y) = {} & 1679616 - 45722880\,y^2 + 445417056\,y^4 - 1935550800\,y^6 \\
& + 4169468065\,y^8 - 4504515400\,y^{10} + 2268233020\,y^{12} - 432170200\,y^{14} \\
& + 36781990\,y^{16} - 1483000\,y^{18} + 29596\,y^{20} - 280\,y^{22} + y^{24} \\
= {} & (81 - 90y^2 + y^4)^2\,(16 - 40y^2 + y^4)^2\,(1 - 10y^2 + y^4)^2.
\end{aligned}$$

This does not have distinct roots, so that $s(y)$ is not a Galois resolvent.


Exercise 14. Use Theorem 12.2.5 and standard results about Galois extensions to prove that
$|\mathrm{Gal}(KL/K)| = [L : K \cap L]$. Then explain why this implies that $|\mathrm{Gal}(KL/K)| < |\mathrm{Gal}(L/F)|$ if
and only if $F$ is a proper subfield of $K \cap L$.


Exercise 15. Let $F \subset L$ and $F \subset K$ be Galois extensions such that $KL$ is defined. We will also
assume that $K \cap L = F$. The goal of this exercise is to prove that $F \subset KL$ is a Galois extension
with Galois group

$$\mathrm{Gal}(KL/F) \simeq \mathrm{Gal}(L/F) \times \mathrm{Gal}(K/F).$$

(a) Prove that $F \subset KL$ is Galois and that $\sigma \in \mathrm{Gal}(KL/F)$ implies that $\sigma|_L \in \mathrm{Gal}(L/F)$ and
$\sigma|_K \in \mathrm{Gal}(K/F)$.
(b) Use part (d) of Exercise 6 to show that there is a one-to-one group homomorphism

$$\mathrm{Gal}(KL/F) \longrightarrow \mathrm{Gal}(L/F) \times \mathrm{Gal}(K/F).$$

(c) Use Exercise 14 and the Tower Theorem to show that $[KL : F] = [K : F][L : F]$.
(d) Conclude that the map of part (b) is an isomorphism.


12.3 KRONECKER 


In this section we will explore how Kronecker combined ideas of Lagrange, Gauss, 
and Galois to give a powerful construction of the splitting field of a separable poly- 
nomial over a field of characteristic 0. 


A. Algebraic Quantities. In 1882 Kronecker published the important paper
Grundzüge einer arithmetischen Theorie der algebraischen Grössen [Kronecker,
Vol. II, pp. 237-387]. In English, the title is “Foundations of an arithmetic theory
of algebraic quantities,” which signals Kronecker’s intention to create a general
foundation for dealing with algebraic objects.

Kronecker begins his Grundzüge by describing the fields that he will work over,
although he deliberately avoids using Dedekind’s terminology of “fields.” The quotation
from Dedekind given in the Historical Notes to Section 4.1 shows that Dedekind’s
definition is very abstract: anything that satisfies the field axioms is a field. Kronecker,
on the other hand, wants to emphasize that the objects he deals with are very concrete.
We will use the term “field” when discussing what Kronecker does, though
Kronecker would not be entirely comfortable with this practice.

Kronecker’s basic objects of study are elements of a Rationalitäts-Bereich (domain
of rationality). Such a domain is built out of finitely many algebraischen Grössen
(algebraic quantities) $\mathfrak{R}', \mathfrak{R}'', \mathfrak{R}''', \ldots$, which can be variables or roots of polynomials
(we will say more about this below). Then an element of the Rationalitäts-Bereich is
a rational function with integer coefficients in these quantities. In modern terms, this
is the field

(12.30)    $L = \mathbb{Q}(\mathfrak{R}', \mathfrak{R}'', \mathfrak{R}''', \ldots),$

since by clearing denominators, every element of $L$ can be written as a quotient of
polynomials with integer coefficients. For Kronecker, however, the emphasis is more
on the elements than on the field.

The basic operation on such fields is adjunction. Given $L$ as in (12.30), consider
an irreducible polynomial with coefficients in $L$. Then a root of this polynomial is
an algebraic quantity $\mathfrak{G}$ that gives a new Rationalitäts-Bereich when adjoined to $L$.
In §2 of the Grundzüge, Kronecker assumes without comment that $\mathfrak{G}$ exists. As we
will see, he eventually explains why this assumption is valid.

Here is a simple example of Kronecker’s adjunction process. 


Example 12.3.1 Given a variable $x$, the field $\mathbb{Q}(x)$ is an example of a Rationalitäts-Bereich.
Furthermore, in Exercise 1 you will verify that $y^2 - 4x^3 - x$ is irreducible
as a polynomial in $y$ with coefficients in $\mathbb{Q}(x)$. Thus, by adjoining a root $y_1$ of this
polynomial to $\mathbb{Q}(x)$, we get a new Rationalitäts-Bereich

$$\mathbb{Q}(x, y_1) = \mathbb{Q}\bigl(x, \sqrt{4x^3 + x}\bigr).$$

In particular, $y_1 = \sqrt{4x^3 + x}$ is an algebraic quantity. ◁▷


Kronecker then studies the structure of the fields (12.30). His main result is that
any such field can be written as an extension

(12.31)    $\mathbb{Q} \subset \mathbb{Q}(\mathfrak{R}_1, \mathfrak{R}_2, \mathfrak{R}_3, \ldots) \subset \mathbb{Q}(\mathfrak{G}, \mathfrak{R}_1, \mathfrak{R}_2, \mathfrak{R}_3, \ldots)$

where $\mathfrak{R}_1, \mathfrak{R}_2, \mathfrak{R}_3, \ldots$ can be regarded as variables over $\mathbb{Q}$, and $\mathfrak{G}$ is algebraic over
$\mathbb{Q}(\mathfrak{R}_1, \mathfrak{R}_2, \mathfrak{R}_3, \ldots)$. In Exercise 2 you will show that this follows from the result
of Steinitz discussed in the Mathematical Notes to Section 4.1, together with the
Theorem of the Primitive Element.

In contrast to modern presentations, the fields considered by Kronecker are constructed
explicitly. This constructive attitude runs very deep. For example, rather
than just defining what it means for a polynomial to be irreducible, Kronecker gives
a method for deciding whether or not a polynomial with coefficients in a field of
the form (12.31) is irreducible. To do this, he first discusses polynomials in $\mathbb{Q}[x]$
and describes the algorithm presented in Proposition 4.2.2 for factoring polynomials
over $\mathbb{Q}$. He then gives a terse explanation of how to factor in the general case. The
missing details can be found in Edwards’s book [7].


B. Module Systems. Besides developing a theory of fields, the Grundzüge
also considers rings, ideals, and quotient rings. This begins in §5 of the Grundzüge,
where Kronecker introduces the Integritäts-Bereich (domain of integrality) built from
$\mathfrak{R}', \mathfrak{R}'', \mathfrak{R}''', \ldots$. In modern terms, this is the integral domain

$$R = \mathbb{Z}[\mathfrak{R}', \mathfrak{R}'', \mathfrak{R}''', \ldots]$$

consisting of all polynomials in $\mathfrak{R}', \mathfrak{R}'', \mathfrak{R}''', \ldots$ with integer coefficients. The field
of fractions of $R$ is the Rationalitäts-Bereich $\mathbb{Q}(\mathfrak{R}', \mathfrak{R}'', \mathfrak{R}''', \ldots)$.

The next step is to define certain ideals of $R$. In §21 of the Grundzüge, Kronecker
takes finitely many elements $M_1, M_2, M_3, \ldots \in R$ and defines the module system
$(M_1, M_2, M_3, \ldots)$ to consist of all linear combinations with coefficients in $R$

$$A_1 M_1 + A_2 M_2 + A_3 M_3 + \cdots, \qquad A_1, A_2, A_3, \ldots \in R.$$

Given $M, M' \in R$, Kronecker then defines

(12.32)    $M \equiv M' \ (\mathrm{modd.}\ M_1, M_2, M_3, \ldots)$

to mean that $M - M'$ is contained in $(M_1, M_2, M_3, \ldots)$. These days, we say “ideal”
rather than “module system” and we write the ideal as

$$I = (M_1, M_2, M_3, \ldots) = \{A_1 M_1 + A_2 M_2 + A_3 M_3 + \cdots \mid A_1, A_2, A_3, \ldots \in R\}.$$

(In Exercise 3 you will prove that this is an ideal of $R$.) Then (12.32) means that
$M - M' \in I$, which is equivalent to the equality $M + I = M' + I$ of cosets in $R/I$.
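
To make the congruence (12.32) concrete, here is a minimal sketch in Python for one special case only: $R = \mathbb{Z}[x]$ and a module system of the form $(p, f)$ with $p$ an integer and $f$ monic. This choice, and the function names `poly_sub`, `remainder_by_monic`, and `congruent`, are assumptions made for illustration; they are not Kronecker's notation. Polynomials are stored as coefficient lists, lowest degree first. The test relies on the fact that, for monic $f$, a polynomial lies in $(p, f)$ exactly when its remainder on division by $f$ has all coefficients divisible by $p$.

```python
def poly_sub(a, b):
    """Subtract integer polynomials given as coefficient lists, lowest degree first."""
    n = max(len(a), len(b))
    a = a + [0] * (n - len(a))
    b = b + [0] * (n - len(b))
    return [x - y for x, y in zip(a, b)]


def remainder_by_monic(a, f):
    """Remainder of a on division by the monic polynomial f, computed in Z[x]."""
    if f[-1] != 1:
        raise ValueError("f must be monic")
    r = list(a)
    d = len(f) - 1
    # Cancel the leading term of r until deg r < deg f.
    while len(r) - 1 >= d:
        lead = r[-1]
        shift = len(r) - 1 - d
        for i, c in enumerate(f):
            r[shift + i] -= lead * c
        r.pop()  # the leading coefficient is now 0
    return r


def congruent(m1, m2, p, f):
    """Test whether M - M' lies in the module system (p, f) of Z[x], with f monic."""
    r = remainder_by_monic(poly_sub(m1, m2), f)
    return all(c % p == 0 for c in r)


# x^3 and 4x differ by an element of (5, x^2 + 1), since x^3 = x*(x^2 + 1) - x and -x - 4x = -5x.
print(congruent([0, 0, 0, 1], [0, 4], 5, [1, 0, 1]))
# x and 1 are not congruent modulo (5, x^2 + 1).
print(congruent([0, 1], [1], 5, [1, 0, 1]))
```

The same idea applies to module systems with more generators, but deciding membership then generally needs more than a single division step, which this sketch does not attempt.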

It follows that Kronecker is developing the basic language of ideals and quotient 
rings. However, Kronecker didn’t use Dedekind’s term “ideal,” because Dedekind 
allowed his ideals to be very abstract, while Kronecker was only interested in the 
explicitly constructed ideals described above. 

For us, the most important application of these ideas came in Section 3.1. Recall
that in the proof of Proposition 3.1.3, we showed that if $f \in F[x]$ is an irreducible
polynomial, then $L = F[x]/(f)$ is an extension field of $F$ such that $\alpha = x + (f) \in L$ is
a root of $f$. This is how we proved the existence of roots.

This is equally important for Kronecker, for he uses a similar construction to give
a precise meaning to the term “algebraic quantity.” The idea is as follows: In (12.31),
let $G(x)$ be the minimal polynomial of $\mathfrak{G}$ over $\mathbb{Q}(\mathfrak{R}_1, \mathfrak{R}_2, \mathfrak{R}_3, \ldots)$. Then one can
replace the extension field $\mathbb{Q}(\mathfrak{G}, \mathfrak{R}_1, \mathfrak{R}_2, \mathfrak{R}_3, \ldots)$ with the quotient ring

(12.33)    $\mathbb{Q}(\mathfrak{R}_1, \mathfrak{R}_2, \mathfrak{R}_3, \ldots)[x]/(G),$

where the coset $x + (G)$ plays the role of the root $\mathfrak{G}$. Since $\mathfrak{R}_1, \mathfrak{R}_2, \mathfrak{R}_3, \ldots$ are
variables in (12.31), we now have a rigorous construction of the algebraic quantity
$\mathfrak{G}$ in terms of polynomials and ideals.

In this construction, Kronecker preferred to work with rings rather than fields
since he wanted to avoid denominators as much as possible. So Kronecker would
replace (12.33) with

$$\mathbb{Z}[\mathfrak{R}_1, \mathfrak{R}_2, \mathfrak{R}_3, \ldots, x]/(G),$$

where one has now suitably cleared denominators so that $G$ is an irreducible polynomial
in $\mathbb{Z}[\mathfrak{R}_1, \mathfrak{R}_2, \mathfrak{R}_3, \ldots, x]$. This quotient ring is an integral domain whose field of
fractions is the corresponding field (12.33).

Here is a simple example of what these presentations look like. 


Example 12.3.2 Consider the field $\mathbb{Q}(x, \sqrt{4x^3 + x})$ constructed in Example 12.3.1.
This is $\mathbb{Q} \subset \mathbb{Q}(x) \subset \mathbb{Q}(x, \sqrt{4x^3 + x})$. As a polynomial in $y$, the minimal polynomial
of $\sqrt{4x^3 + x}$ is $y^2 - 4x^3 - x$, so that (12.33) becomes

$$\mathbb{Q}(x)[y]/(y^2 - 4x^3 - x).$$

Kronecker’s presentation, which uses $\mathbb{Z}$ rather than $\mathbb{Q}$, would be to take the field of
fractions of the integral domain

$$\mathbb{Z}[x, y]/(y^2 - 4x^3 - x).$$

Notice how polynomials in several variables appear naturally in this example. ◁▷
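
As a small illustration of what computing in this quotient ring looks like, the following worked lines (our own sample computations, not taken from Kronecker) reduce a few expressions modulo $y^2 - 4x^3 - x$. The only rule used is that $y^2$ may be replaced by $4x^3 + x$.

```latex
% Computing in Z[x,y]/(y^2 - 4x^3 - x): replace y^2 by 4x^3 + x whenever it appears.
\begin{align*}
(x + y)^2 &= x^2 + 2xy + y^2 \equiv 4x^3 + x^2 + x + 2xy, \\
y^3 &= y \cdot y^2 \equiv (4x^3 + x)\,y, \\
\frac{1}{y} &= \frac{y}{y^2} = \frac{y}{4x^3 + x} \quad \text{in the field of fractions.}
\end{align*}
```

Every element of the quotient ring thus has a representative of the form $a(x) + b(x)\,y$, which matches the description of the field as $\mathbb{Q}(x)$ with a square root of $4x^3 + x$ adjoined.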


As noted in [7], Kronecker was aware that this construction allows one to dispense
with “algebraic quantities.” Kronecker states this as follows:

    ... the whole arithmetic theory of algebraic quantities can be reduced to a theory
    of entire functions of variables and unknowns with integer coefficients ...

(see [Kronecker, Vol. II, p. 377]). In the nineteenth century, “entire function” meant
polynomial. Thus Kronecker is saying that we can construct all algebraic quantities
using congruences of polynomials in several variables with coefficients in $\mathbb{Z}$.


C. Splitting Fields. One of the points made in [8] is that Kronecker’s conception
of algebraic quantity evolved during the writing of the Grundzüge. The early sections
of the Grundzüge don’t give a precise definition of algebraic quantity, yet the later
sections provide the language needed for this purpose (as noted in the above quotation).
But to rewrite the Grundzüge from this new point of view would have been an
overwhelming task. Hence one needs to look at Kronecker’s subsequent papers to
see how he worked out these ideas.

For us, Kronecker’s 1887 paper Ein Fundamentalsatz der allgemeinen Arithmetik 
[Kronecker, Vol. III, pp. 209-240] is the most relevant. Here, he uses his new methods 
to give an explicit construction of splitting fields. His presentation and proof involve 
a deep understanding of the ideas of Lagrange, Gauss, and Galois. 

Let $F$ be a field of characteristic 0, and let

(12.34)    $f(x) = x^n - c_1 x^{n-1} + c_2 x^{n-2} + \cdots + (-1)^n c_n \in F[x]$

be a separable polynomial. In his Fundamentalsatz paper, Kronecker uses
$F = \mathbb{Q}(\mathfrak{R}', \mathfrak{R}'', \ldots, \mathfrak{R}^{(n-1)})$, where $n > 1$ and $\mathfrak{R}', \mathfrak{R}'', \ldots, \mathfrak{R}^{(n-1)}$ are variables. So for
Kronecker, the coefficients are explicitly known objects. His goal was to describe
the roots of $f$ using module systems and congruences.

Kronecker was inspired by Galois’s approach to the Galois group. Galois assumed
that the roots $\alpha_1,\dots,\alpha_n$ of $f$ lie in some extension of $F$. Let us recall some of the
ideas developed in Section 12.2. The Galois resolvent (12.21) is the polynomial

$$s(y) = \prod_{\sigma \in S_n} \bigl(y - (t_1\alpha_{\sigma(1)} + \cdots + t_n\alpha_{\sigma(n)})\bigr),$$

where $t_1,\dots,t_n \in F$ are chosen so that $s(y)$ is separable. Then the key player for
Galois is the irreducible factor $h(y)$ of $s(y)$ that vanishes at $V = t_1\alpha_1 + \cdots + t_n\alpha_n$.
Recall from Section 12.2 that by Galois’s Lemma III, $V$ is a primitive element of
the splitting field of $f$ over $F$. Thus each root $\alpha_i$ is a polynomial $\varphi_i(V)$ in $V$ with
coefficients in $F$.
This implies that in the splitting field $F(V)$, we can write

$$f(x) = \prod_{i=1}^{n} (x - \alpha_i) = \prod_{i=1}^{n} \bigl(x - \varphi_i(V)\bigr).$$

Furthermore, since $h(y)$ is the minimal polynomial of $V$ over $F$, we can replace $F(V)$
with the quotient ring $F[y]/(h)$, where the coset $y + (h)$ plays the role of $V$. If we
substitute this into the above equation and use congruence notation $\equiv$, then we can
write the above factorization as

(12.35)    $f(x) \equiv \prod_{i=1}^{n} \bigl(x - \varphi_i(y)\bigr) \bmod h(y).$

This is close to what Kronecker states in his Fundamentalsatz paper. However, our
derivation of (12.35) assumed knowledge of the roots $\alpha_1,\dots,\alpha_n$ of $f$. Kronecker’s
goal is to compute this factorization without knowing the roots of the polynomial in
advance.

How does Kronecker accomplish this? In reading Gauss’s 1815 proof of the 
Fundamental Theorem of Algebra (discussed in the Historical Notes to Section 3.2), 
Kronecker learned the strategy of applying Lagrange’s methods in the universal case 
and then specializing to the specific polynomial at hand. This works as follows. 

In the universal situation, the variables $x_1,\dots,x_n$ are roots of the universal polynomial
of degree $n$,

$$\tilde{f} = (x - x_1) \cdots (x - x_n) = x^n - \sigma_1 x^{n-1} + \cdots + (-1)^n \sigma_n,$$

which has coefficients in $F(\sigma_1,\dots,\sigma_n)$. Then the substitution $\sigma_i \mapsto c_i$ takes $\tilde{f}$ to
$f = x^n - c_1 x^{n-1} + \cdots + (-1)^n c_n$.

Since $F$ has characteristic 0, we can pick distinct integers $t_1,\dots,t_n \in \mathbb{Z} \subset F$. This
gives the universal Galois resolvent

(12.36)    $S(y) = \prod_{\sigma \in S_n} \bigl(y - (t_1 x_{\sigma(1)} + \cdots + t_n x_{\sigma(n)})\bigr) = \prod_{\sigma \in S_n} (y - \sigma \cdot \beta),$

where $\beta = t_1 x_1 + \cdots + t_n x_n$. The theory of symmetric polynomials from Chapter 2
shows that $S(y)$ lies in $\mathbb{Q}[\sigma_1,\dots,\sigma_n,y]$.

Now let $s(y) \in F[y]$ be the polynomial obtained from $S(y)$ by the substitution
$\sigma_i \mapsto c_i$. Kronecker then claims that $t_1,\dots,t_n \in \mathbb{Z}$ can be chosen so that $s(y)$ is
separable. Once this is done, he will have constructed a Galois resolvent of $f$ without
knowing the roots of $f$. Exercises 1 and 2 of Section 12.2 proved the existence
of $t_1,\dots,t_n$ in $F$ (rather than in $\mathbb{Z}$) using the roots $\alpha_1,\dots,\alpha_n$. Of course, Kronecker
needs to use a different argument. Exercises 4 and 5 will show how to find the desired
$t_1,\dots,t_n \in \mathbb{Z}$ without using the roots.
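
The search for suitable integers can be sketched in code. The following Python fragment is only an illustration under stated assumptions: it treats the case $n = 2$, where a direct computation gives $\Delta(s) = (t_1 - t_2)^2 (c_1^2 - 4c_2)$ for $f = x^2 - c_1 x + c_2$, and it searches the finite grid of integer points $0 \le a_i \le N_i$ used in Exercise 5. The representation of a polynomial as a dictionary from exponent tuples to coefficients, and the names `evaluate` and `first_nonvanishing_point`, are our own choices.

```python
from itertools import product


def evaluate(g, point):
    """Evaluate g, a dict mapping exponent tuples to integer coefficients, at an integer point."""
    total = 0
    for exponents, coeff in g.items():
        term = coeff
        for a, e in zip(point, exponents):
            term *= a ** e
        total += term
    return total


def first_nonvanishing_point(g):
    """Search the grid 0 <= a_i <= N_i, where N_i is the highest power of t_i in g.

    Assumes g has at least one term. Returns a grid point where g is nonzero,
    or None if g vanishes at every grid point.
    """
    n = len(next(iter(g)))
    bounds = [max(exps[i] for exps in g) for i in range(n)]
    for point in product(*(range(b + 1) for b in bounds)):
        if evaluate(g, point) != 0:
            return point
    return None


# Example for n = 2 and f = x^2 - c1*x + c2: Delta(s) = (t1 - t2)^2 * (c1^2 - 4*c2).
c1, c2 = 0, -2                     # f = x^2 - 2 (an assumed sample polynomial)
disc_f = c1 * c1 - 4 * c2          # discriminant of f
delta_s = {(2, 0): disc_f, (1, 1): -2 * disc_f, (0, 2): disc_f}
t = first_nonvanishing_point(delta_s)
print(t, evaluate(delta_s, t))
```

For larger $n$ the polynomial $\Delta(s)$ is far more complicated, and this sketch makes no attempt to compute it; it only shows the grid search once $\Delta(s)$ is in hand.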

In Section 3.2, we observed that Gauss used a similar method to compute the
polynomial $Z(x,u) = \prod_{1 \le i < j \le n} \bigl(x - (\alpha_i + \alpha_j)u + \alpha_i\alpha_j\bigr)$ from (3.13) without knowing
the roots of $f$. Do you see how this could have influenced Kronecker?

Given the Galois resolvent s(y) € F[y], Kronecker then factors s(y) into a product 
of irreducible polynomials using the methods mentioned earlier in the section. Let 
h(y) be one of the irreducible factors. This gives the polynomial h(y) we need for 
(12.35). The important thing for Kronecker is that h(y) was constructed without 
knowing the roots of f. 

It remains to find explicit formulas for the polynomials $\varphi_i(y)$ in (12.35). For this
purpose, we turn to the methods of Lagrange discussed in Section 12.1. Since $S(y)$ is
the resolvent polynomial of $\beta = t_1 x_1 + \cdots + t_n x_n$, one can find explicitly computable
polynomials $\Psi_i(y)$ in $F[\sigma_1,\dots,\sigma_n,y]$ such that

(12.37)    $x_i = \dfrac{\Psi_i(\beta)}{\Delta(S)},$

where $\Delta(S)$ is the discriminant of $S$. You will prove this in Exercises 6-8.

The substitution $\sigma_i \mapsto c_i$ maps $S(y)$ to $s(y)$ and $\Delta(S)$ to $\Delta(s)$. Furthermore,
$\Delta(s) \ne 0$, since $s(y)$ is separable. Thus $\Psi_i(y)/\Delta(S)$ maps to some polynomial
$\varphi_i(y) \in F[y]$. We can now state the construction of the splitting field of $f$ given in
Kronecker’s Fundamentalsatz paper [Kronecker, Vol. III, p. 216].


Theorem 12.3.3 Let $F$ have characteristic 0. Let $f \in F[x]$ be monic and separable
of degree $n > 0$, and let $s(y)$ and $\varphi_i(y)$, $i = 1,\dots,n$, be constructed as above. Then,
for any irreducible factor $h(y) \in F[y]$ of $s(y)$, we have

$$f(x) \equiv \bigl(x - \varphi_1(y)\bigr) \cdots \bigl(x - \varphi_n(y)\bigr) \bmod h(y).$$

This congruence means that each side is a polynomial in $x$ whose coefficients are
equal in the quotient ring $F[y]/(h(y))$. Furthermore, $F[y]/(h(y))$ is a splitting field
of $f$ over $F$. ∎


We will not prove this here. The reader should consult [7] for a modern (but fully 
constructive) proof of Theorem 12.3.3. 
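
To see the pieces of Theorem 12.3.3 in the smallest case, here is a sketch for $f = x^2 - 2$ over $F = \mathbb{Q}$. The choice $(t_1, t_2) = (1, 2)$ is an assumption made for illustration, and we write $\varphi_1, \varphi_2$ for the polynomials that express the roots in terms of $y$. In this tiny case $x_1$ and $x_2$ can be read off directly from $\beta$ and $\sigma_1$, so the general formula (12.37) is not needed.

```latex
% Universal case, n = 2, with the assumed choice (t_1, t_2) = (1, 2):
\begin{align*}
S(y) &= \bigl(y - (x_1 + 2x_2)\bigr)\bigl(y - (2x_1 + x_2)\bigr)
      = y^2 - 3\sigma_1 y + 2\sigma_1^2 + \sigma_2, \\
x_1 &= 2\sigma_1 - \beta, \qquad x_2 = \beta - \sigma_1, \qquad \beta = x_1 + 2x_2 .
\end{align*}
% Specialize to f = x^2 - 2, i.e. c_1 = 0 and c_2 = -2 in the notation of (12.34):
\begin{align*}
s(y) &= y^2 - 2 = h(y), \qquad \Delta(s) = 8 \neq 0, \\
\varphi_1(y) &= -y, \qquad \varphi_2(y) = y, \\
\bigl(x - \varphi_1(y)\bigr)\bigl(x - \varphi_2(y)\bigr) &= x^2 - y^2 \equiv x^2 - 2 = f(x) \pmod{h(y)} .
\end{align*}
```

Here $s(y)$ is already irreducible over $\mathbb{Q}$, so $h = s$, and the quotient ring $\mathbb{Q}[y]/(y^2 - 2)$ is a copy of $\mathbb{Q}(\sqrt{2})$ in which the coset of $y$ and its negative are the two roots of $f$.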

Kronecker remarks that because of Theorem 12.3.3, “one is then relieved of the
introduction of algebraic quantities in many ... algebraic investigations” [Kronecker,
Vol. III, p. 216]. This is his clearest statement of how to avoid algebraic quantities.
Kronecker’s choice of the word Fundamentalsatz (“Fundamental Theorem”) in the
title of his paper indicates the importance he attaches to this result.

Theorems 12.3.3 and 3.1.4 both prove that a splitting field of $f \in F[x]$ exists,
though Kronecker’s theorem differs from Theorem 3.1.4 in two ways:


• Rather than construct the splitting field $L$ using a sequence of quotient rings (as
in Theorem 3.1.4), Kronecker constructs all roots of $f$ simultaneously using just
one quotient ring $L = F[y]/(h(y))$.

• We will see in Chapter 13 that Kronecker’s construction leads directly to an
algorithm for computing the Galois group of $f$.


Hence Kronecker’s construction of the splitting field contains a lot of information 
about the roots of the polynomial. It is harder than what we did in Theorem 3.1.4, 
but for a good reason. 

Theorem 12.3.3 uses a lot of mathematics, including ideas of Lagrange (the uni- 
versal case), Gauss (relating the universal to the specific), and Galois (the Galois 
resolvent). It is impressive how Kronecker was able to synthesize all of this mathe- 
matics into one theorem. One irony is that while Kronecker is given credit as the first 
to prove the existence of the roots of a polynomial, his version of this result is rarely 
mentioned, since most books use the proof of Theorem 3.1.4 given in Chapter 3. 
While the modern proof illustrates the power of abstract algebra, it does not reflect 
the richness of the historical context that led to Kronecker’s proof of the existence of 
splitting fields. 


Historical Notes 


Congruences modulo a polynomial or module system play a central role in Kronecker’s
construction of algebraic quantities. When Kronecker wrote his Grundzüge
in 1882, there were many known examples of congruences, including:


• $a \equiv b \bmod n$, for $a, b \in \mathbb{Z}$ (Gauss, 1801).

• $\varphi(x) \equiv \chi(x) \bmod x^2 + 1$, for $\varphi(x), \chi(x) \in \mathbb{R}[x]$ (Cauchy, 1845).

• $\varphi(\alpha) \equiv \psi(\alpha) \bmod p\mathbb{Z}[\alpha]$, for $\varphi(\alpha), \psi(\alpha) \in \mathbb{Z}[\alpha]$ (Schönemann, 1846).

• $P(x) \equiv Q(x) \bmod (p, f(x))$, for $P(x), Q(x) \in \mathbb{Z}[x]$ (Dedekind, 1857).


While Kronecker did not originate the use of congruences in the polynomial setting, 
he was clearly the first to realize the full power of this construction. 

Our discussion of Kronecker’s work omitted several important topics covered in 
the Grundzüge. For example, Kronecker created a theory of divisors to generalize
Kummer’s ideal numbers. This differs from Dedekind’s theory of ideals, which is 
another way to generalize ideal numbers. An exposition of divisor theory can be 
found in [6]. Kronecker also considered discriminants in detail. 

We should also note that our discussion made liberal use of set theory. For
example, when we gave Kronecker’s definition of module system $(M_1, M_2, \ldots)$ in a
ring $R$, we immediately translated it into the ideal

$$(M_1, M_2, \ldots) = \{A_1 M_1 + A_2 M_2 + \cdots \mid A_1, A_2, \ldots \in R\}.$$

Kronecker, in contrast, says that $A_1 M_1 + A_2 M_2 + \cdots$ “contains the module system
$(M_1, M_2, \ldots)$” [Kronecker, Vol. II, p. 335]. To us, this seems backwards. As explained
to me by Edwards, Kronecker’s use of “contains” is similar to saying that 6 “contains”
2 as a divisor. This makes sense when one realizes the importance of divisors in
Kronecker’s mathematical thought. An introduction to Kronecker’s views of set
theory and the foundations of mathematics can be found in [9].

An important development in the late nineteenth century was the realization that 
one needs to study the foundations of mathematics. It was no longer sufficient to 
simply assume the existence of mathematical objects such as algebraic quantities. 
Rather, one had to give a rigorous proof of their existence. But instead of Kronecker’s 
constructive vision of what this meant, the set-theoretic approach of Dedekind and 
Cantor came to dominate. This explains why modern mathematics is firmly based 
on sets and why abstract algebra is so different from high school algebra. 

This chapter began with Lagrange’s attempts to understand the roots of polynomi- 
als and ended with Kronecker’s impressive construction of splitting fields. Along the 
way, we were able to witness the brilliance of Galois and the beginnings of modern 
algebra. It has been a remarkable odyssey. 


Exercises for Section 12.3 


Exercise 1. Prove that $y^2 - 4x^3 - x$ is irreducible when considered as an element of $\mathbb{Q}(x)[y]$.


Exercise 2. Show that (12.31) follows from the Theorem of the Primitive Element and the 
theorem of Steinitz mentioned in the Mathematical Notes to Section 4.1. 


Exercise 3. Let $R$ be a commutative ring and let $M_1,\dots,M_s$ be elements of $R$. Prove that the
set $(M_1,\dots,M_s) = \bigl\{\sum_{i=1}^{s} A_i M_i \bigm| A_i \in R\bigr\}$ is an ideal of $R$.


Exercise 4. In the discussion leading up to Theorem 12.3.3, we have the polynomial
$S(y) \in F[\sigma_1,\dots,\sigma_n,y]$ defined in (12.36). Then $s(y) \in F[y]$ is obtained by $\sigma_i \mapsto c_i$, where $c_i$ is as
in (12.34). Both of these polynomials depend on $t_1,\dots,t_n$. The goal of this exercise is to
show that if $f$ is separable, then $\Delta(s)$ is a nonzero polynomial when $t_1,\dots,t_n$ are regarded as
variables. Since $F$ has characteristic 0, part (a) of Exercise 5 implies that $\Delta(s) \ne 0$ for some
$t_1,\dots,t_n \in \mathbb{Z}$.

To prove that $\Delta(s)$ is a nonzero polynomial in $t_1,\dots,t_n$, let $F \subset L$ be the splitting field of
$f$ constructed in Theorem 3.1.4. Thus $f = (x - \alpha_1) \cdots (x - \alpha_n)$ in $L[x]$.

(a) If we regard the $t_i$ as variables, explain why $S(y)$ becomes a polynomial in $y$ with
coefficients in $F[\sigma_1,\dots,\sigma_n,t_1,\dots,t_n]$. Conclude that $s(y) \in F[t_1,\dots,t_n,y]$ and hence that
$\Delta(s) \in F[t_1,\dots,t_n]$.

(b) Explain why $s(y) = \prod_{\sigma \in S_n} \bigl(y - (t_1\alpha_{\sigma(1)} + \cdots + t_n\alpha_{\sigma(n)})\bigr)$ in $L[t_1,\dots,t_n,y]$.

(c) Use part (b) and the separability of $f$ to show that $s(y)$ has distinct roots, all of which lie
in $L[t_1,\dots,t_n]$. Conclude that $\Delta(s)$ is a nonzero element of $F[t_1,\dots,t_n]$.


Exercise 5. Let $F$ be a field, and let $g \in F[t_1,\dots,t_n]$ be nonzero.
(a) Suppose that $F$ has characteristic 0, so that $\mathbb{Q} \subset F$. For each $i$, pick a nonnegative integer
$N_i$ such that the highest power of $t_i$ appearing in $g$ is at most $N_i$, and let

$$A = \{(a_1,\dots,a_n) \mid a_i \in \mathbb{Z},\ 0 \le a_i \le N_i\}.$$

Prove that there is $(a_1,\dots,a_n) \in A$ such that $g(a_1,\dots,a_n) \ne 0$.

(b) Now suppose that $F$ has characteristic $p$ and is infinite. Modify the argument of part (a)
to show that there are $a_1,\dots,a_n \in F$ such that $g(a_1,\dots,a_n) \ne 0$.

(c) Give an example to illustrate why the hypothesis “$F$ is infinite” is needed in part (b).


Exercise 6. In $F[x_1,\dots,x_n,x]$, consider the polynomial

$$\tilde{f} = (x - x_1) \cdots (x - x_n) = x^n - \sigma_1 x^{n-1} + \cdots + (-1)^n \sigma_n.$$

As noted in Section 2.2, we can regard $\tilde{f} \in F[\sigma_1,\dots,\sigma_n,x]$ as the universal polynomial of degree
$n$. The goal of this exercise is to show that if $\tilde{f}'$ denotes the derivative of $\tilde{f}$, then there are
polynomials $A, B \in F[\sigma_1,\dots,\sigma_n,x]$ such that $\deg(A) \le n-2$, $\deg(B) \le n-1$, and

$$A\tilde{f} + B\tilde{f}' = \Delta.$$

Here, $\Delta$ is the discriminant defined in Section 2.4. The proof given here is taken from Gauss’s
1815 proof of the Fundamental Theorem of Algebra (see [14, pp. 293-295]).
(a) Show that

$$B = \frac{\Delta\,(x - x_2) \cdots (x - x_n)}{(x_1 - x_2)^2 \cdots (x_1 - x_n)^2} + \frac{\Delta\,(x - x_1)(x - x_3) \cdots (x - x_n)}{(x_2 - x_1)^2 (x_2 - x_3)^2 \cdots (x_2 - x_n)^2} + \cdots$$

is a polynomial in $x$ of degree at most $n - 1$ whose coefficients are symmetric polynomials
in $x_1,\dots,x_n$. Conclude that $B \in F[\sigma_1,\dots,\sigma_n,x]$.

(b) Prove that $\Delta - B\tilde{f}'$ vanishes when $x = x_i$.
(c) Conclude that $\Delta - B\tilde{f}'$ is divisible by $\tilde{f}$, and set

$$A = \frac{\Delta - B\tilde{f}'}{\tilde{f}}.$$

Show that $A$ and $B$ have the desired properties.


Exercise 7. Let $f \in F[x]$ be monic of degree $n > 0$ with discriminant $\Delta(f) \in F$. Use
Exercise 6 to show that there are $A, B \in F[x]$ with $\deg(A) \le n-2$, $\deg(B) \le n-1$ such that
the coefficients of $A$ and $B$ are polynomials in the coefficients of $f$ and $Af + Bf' = \Delta(f)$.


Exercise 8. This exercise is concerned with $\Psi_i(y)$ from (12.37). Let $S(y)$ be as in (12.36).

(a) Show that applying (12.5) and (12.8) from the proof of Theorem 12.1.6 with
$\varphi = \beta = t_1 x_1 + \cdots + t_n x_n$ and $\psi = x_i$ gives

$$x_i = \frac{\Phi_i(\beta)}{S'(\beta)}, \qquad \text{where} \qquad \Phi_i(y) = \sum_{\sigma \in S_n} \frac{S(y)}{y - \sigma \cdot \beta}\; x_{\sigma(i)}.$$

Also prove that $\Phi_i(y) \in F[\sigma_1,\dots,\sigma_n,y]$.

(b) Use Exercise 7 to show that there are polynomials $A, B \in F[\sigma_1,\dots,\sigma_n,y]$ such that
$A(y)S(y) + B(y)S'(y) = \Delta(S)$. Also show that $B(\beta)S'(\beta) = \Delta(S)$.
(c) Use part (b) to show that (12.37) holds with $\Psi_i(y) = B(y)\Phi_i(y)$.


CHAPTER 13


COMPUTING GALOIS GROUPS 


Galois groups are not easy to compute. As Galois says in the “Discours préliminaire”
to his first memoir on Galois theory [Galois, p. 39]:

    If now you give me an equation that you have chosen at will, and about which
    you want to know if it is or is not solvable by radicals, I cannot do any more
    than indicate the means for answering your question, without wanting to charge
    either myself or any other person with doing it. In a word, the calculations are
    impractical.

Even with the aid of modern computers, it is not easy to compute the Galois group
of a polynomial of large degree (currently 50 or higher) unless the polynomial has
some special structure.

This chapter will explore some (but not all) ways of computing Galois groups of
arbitrary polynomials; Chapters 14 and 15 will describe special classes of polynomials
for which it is possible to say more about the Galois group.


13.1 QUARTIC POLYNOMIALS 


In Section 7.4 we explained how to compute the Galois group of a monic irreducible
separable cubic polynomial f over a field F of characteristic different from 2. Recall
that up to isomorphism the only possibilities for the Galois group are Z/3Z and $S_3$,
and that these cases are distinguished by whether or not the discriminant $\Delta(f)$ is the
square of an element of F.

In this section we will prove a similar result for a monic irreducible quartic
polynomial $f \in F[x]$, where F has characteristic $\neq 2$. Note that f is necessarily
separable by Lemma 5.3.5. We will write f in the form

(13.1)    $f = x^4 - c_1x^3 + c_2x^2 - c_3x + c_4, \quad c_1, c_2, c_3, c_4 \in F.$

In computing the Galois group of f, the key players will be the discriminant of f

(13.2)
$$\begin{aligned}
\Delta(f) = {} & 144\,c_2c_1^2c_4^2 + 18\,c_1c_3^3c_2 - 192\,c_1c_3c_4^2 - 6\,c_1^2c_3^2c_4 \\
& + 144\,c_4c_3^2c_2 - 4\,c_2^3c_1^2c_4 + c_2^2c_1^2c_3^2 + 256\,c_4^3 \\
& - 27\,c_3^4 + 18\,c_1^3c_3c_2c_4 - 4\,c_1^3c_3^3 - 128\,c_2^2c_4^2 \\
& + 16\,c_2^4c_4 - 4\,c_2^3c_3^2 - 27\,c_1^4c_4^2 - 80\,c_1c_3c_2^2c_4,
\end{aligned}$$

computed by the methods of Section 5.3, and the Ferrari resolvent of f

(13.3)    $\theta_f(y) = y^3 - c_2y^2 + (c_1c_3 - 4c_4)\,y - c_3^2 - c_1^2c_4 + 4c_2c_4.$


Below we explain how (13.3) relates to the theory developed in Section 12.1. 
The Galois group of f is Gal(L/F), where L is a splitting field of f over F.
Proposition 6.3.1 implies that there is a subgroup $G \subset S_4$ such that

(13.4)    $\mathrm{Gal}(L/F) \simeq G \subset S_4.$


Since we only need Gal(L/F) up to isomorphism, we can focus on the subgroup G. 
But G depends on how we label the roots. In Exercise 1 you will show that if we 
change the labels, then G gets replaced by a conjugate subgroup in S4. Since the 
roots are intrinsically unlabeled, our goal is to compute G up to conjugacy. 


Theorem 13.1.1 Let F have characteristic $\neq 2$, and let $f \in F[x]$ be monic and
irreducible of degree 4. Write f as in (13.1), and let $\Delta(f)$ and $\theta_f(y)$ be defined as in
(13.2) and (13.3). Then the subgroup $G \subset S_4$ from (13.4) is determined as follows:

(a) If $\theta_f(y)$ is irreducible over F, then

$$G = \begin{cases} S_4, & \text{if } \Delta(f) \notin F^2, \\ A_4, & \text{if } \Delta(f) \in F^2. \end{cases}$$

(b) If $\theta_f(y)$ splits completely over F, then

$$G = \langle (12)(34), (13)(24) \rangle \simeq \mathbb{Z}/2\mathbb{Z} \times \mathbb{Z}/2\mathbb{Z}.$$

Furthermore, $\theta_f(y)$ splits completely over F if and only if it is reducible over F
and $\Delta(f) \in F^2$.

(c) If $\theta_f(y)$ has a unique root $\beta$ in F, then G is conjugate to

$$\begin{cases}
\langle (1324), (12) \rangle \simeq D_8, & \text{if } 4\beta + c_1^2 - 4c_2 \neq 0 \text{ and } \Delta(f)(4\beta + c_1^2 - 4c_2) \notin (F^*)^2, \\
 & \text{or } 4\beta + c_1^2 - 4c_2 = 0 \text{ and } \Delta(f)(\beta^2 - 4c_4) \notin (F^*)^2, \\
\langle (1324) \rangle \simeq \mathbb{Z}/4\mathbb{Z}, & \text{otherwise},
\end{cases}$$

where $D_8$ is the dihedral group of order 8. Furthermore, $\theta_f(y)$ has a unique root
in F if and only if it is reducible over F and $\Delta(f) \notin F^2$.


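Proof: In Section 12.1, we defined the universal Ferrari resolvent to be

$$\begin{aligned}
\theta(y) &= (y - (x_1x_2 + x_3x_4))(y - (x_1x_3 + x_2x_4))(y - (x_1x_4 + x_2x_3)) \\
&= y^3 - \sigma_2y^2 + (\sigma_1\sigma_3 - 4\sigma_4)\,y - \sigma_3^2 - \sigma_1^2\sigma_4 + 4\sigma_2\sigma_4.
\end{aligned}$$

If f has roots $\alpha_1, \alpha_2, \alpha_3, \alpha_4 \in L$, then the evaluation map $x_i \mapsto \alpha_i$ takes $\sigma_i \mapsto c_i$ and
hence takes $\theta(y)$ to the Ferrari resolvent $\theta_f(y)$ of f defined in (13.3). It follows that
the roots of $\theta_f(y)$ are

(13.5)    $\alpha_1\alpha_2 + \alpha_3\alpha_4, \quad \alpha_1\alpha_3 + \alpha_2\alpha_4, \quad \alpha_1\alpha_4 + \alpha_2\alpha_3.$

In particular, $\theta_f(y)$ splits completely in L.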

Using this, we can now prove part (a). Since f and $\theta_f$ are monic and irreducible
over F, they are the minimal polynomials over F of $\alpha_1$ and $\alpha_1\alpha_2 + \alpha_3\alpha_4$, respectively.
Hence [L:F] is divisible by both 4 and 3. By the Tower Theorem, we see that [L:F] is
divisible by 12, so that |G| = |Gal(L/F)| is also divisible by 12, since $F \subset L$ is Galois.
In Exercise 2 you will show that $A_4$ is the only subgroup of $S_4$ with 12 elements. Thus
the hypothesis of part (a) implies that $G = A_4$ or $S_4$. Then we are done by Theorem 7.4.1.

Before proving parts (b) and (c), we first observe that $\theta_f$ and f have the same
discriminant, i.e.,

(13.6)    $\Delta(\theta_f) = \Delta(f).$

In the universal case, this was proved in Exercise 9 of Section 2.4. In Exercise 3 you
will explain why this implies (13.6). Since f is separable, we conclude that $\theta_f$ is a
separable cubic.

Now suppose that $\theta_f$ is reducible over F. Since $\theta_f$ is a cubic, this implies that it
has a root $\beta \in F$. By (13.5), we may relabel the roots of f so that

(13.7)    $\beta = \alpha_1\alpha_2 + \alpha_3\alpha_4 \in F.$


As explained earlier, relabeling the roots of f replaces G with a conjugate subgroup. 

We will analyze how (13.7) affects the Galois group. What follows is a special 
case of a general phenomenon that will play a central role in Sections 13.2 and 13.3. 
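We claim that (13.7) implies that

$$G \subset \langle (1324), (12) \rangle \simeq D_8.$$

The rough idea is that the Galois group shrinks when a resolvent has a root in F.
To prove our claim, suppose that $\sigma \in \mathrm{Gal}(L/F)$ corresponds to $\tau \in G \subset S_4$. Since
$\beta \in F$, one easily computes that

$$\begin{aligned}
\beta = \sigma(\beta) &= \sigma(\alpha_1\alpha_2 + \alpha_3\alpha_4) = \alpha_{\tau(1)}\alpha_{\tau(2)} + \alpha_{\tau(3)}\alpha_{\tau(4)} \\
&= \begin{cases}
\alpha_1\alpha_2 + \alpha_3\alpha_4, & \text{if } \tau \in \langle (1324), (12) \rangle, \\
\alpha_1\alpha_3 + \alpha_2\alpha_4, & \text{if } \tau \in (23)\langle (1324), (12) \rangle, \\
\alpha_1\alpha_4 + \alpha_2\alpha_3, & \text{if } \tau \in (24)\langle (1324), (12) \rangle,
\end{cases}
\end{aligned}$$

where

$$S_4 = \langle (1324), (12) \rangle \cup (23)\langle (1324), (12) \rangle \cup (24)\langle (1324), (12) \rangle$$

is the decomposition of $S_4$ into left cosets of $\langle (1324), (12) \rangle$. Since $\theta_f$ is separable,
its three roots (13.5) are distinct, so only the first case can equal $\beta$. This implies
that $G \subset \langle (1324), (12) \rangle$ as claimed.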

As in part (a), we know that 4 divides |G|, since f is irreducible. It follows that
|G| = 4 or 8. Furthermore, we found all subgroups of $D_8$ when we worked out the
Galois correspondence for $\mathbb{Q} \subset \mathbb{Q}(i, \sqrt[4]{2})$ in Section 7.3. In Exercise 4 you will use
this to show that G is one of the four groups

(13.8)    $\langle (12), (34) \rangle, \quad \langle (12)(34), (13)(24) \rangle, \quad \langle (1324) \rangle, \quad \langle (1324), (12) \rangle.$


Since f is irreducible, Proposition 6.3.7 implies that G is a transitive subgroup of S4 
(reread Section 6.3 if you’ve forgotten what transitive means). The first group listed 
in (13.8) is not transitive, so that G is one of the remaining three groups. Parts (b) 
and (c) of the theorem describe how we distinguish among these possibilities. 

We begin with part (b). Since F has characteristic $\neq 2$, Exercise 5 implies that a
monic reducible cubic $g \in F[x]$ splits completely over F if and only if $\Delta(g) \in F^2$.
By (13.6), we conclude that if $\theta_f$ is reducible over F, then it splits completely over F
if and only if $\Delta(f) \in F^2$. This proves the final assertion of part (b). Also, when $\theta_f$
splits completely over F, Theorem 7.4.1 and $\Delta(f) \in F^2$ imply that $G \subset A_4$. Of the
groups in (13.8), only $\langle (12)(34), (13)(24) \rangle \simeq \mathbb{Z}/2\mathbb{Z} \times \mathbb{Z}/2\mathbb{Z}$ lies in $A_4$. This proves
part (b). No conjugacy is needed, since $\langle (12)(34), (13)(24) \rangle$ is normal in $S_4$.

The final assertion of part (c) follows from the final assertion of part (b). Now
suppose that $\beta \in F$ is a root of $\theta_f$ and that $\Delta(f) \notin F^2$. The last condition implies
that $G \not\subset A_4$, so G must be one of the last two groups of (13.8). Our method for
distinguishing these begins with Euler's formula (12.17) for the roots of the universal
quartic $\tilde{f}$. This formula involves the square root

$$\sqrt{4y_1 + \sigma_1^2 - 4\sigma_2}, \quad y_1 = x_1x_2 + x_3x_4,$$

which is related to the roots $x_1, x_2, x_3, x_4$ of $\tilde{f}$ via

$$4y_1 + \sigma_1^2 - 4\sigma_2 = (x_1 + x_2 - x_3 - x_4)^2$$

(see the discussion leading up to (12.12)). If we apply the evaluation map $x_i \mapsto \alpha_i$
and use (13.7), then we obtain

(13.9)    $4\beta + c_1^2 - 4c_2 = (\alpha_1 + \alpha_2 - \alpha_3 - \alpha_4)^2.$

It follows that

(13.10)    $\sqrt{\Delta(f)(4\beta + c_1^2 - 4c_2)} = \sqrt{\Delta(f)} \cdot (\alpha_1 + \alpha_2 - \alpha_3 - \alpha_4) \in L.$


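Now suppose that $4\beta + c_1^2 - 4c_2 \neq 0$, i.e., $4\beta + c_1^2 - 4c_2 \in F^*$. If $G = \langle (1324) \rangle$,
then Gal(L/F) has a generator $\sigma$ that maps to (1324). One easily computes that

$$\sigma(\sqrt{\Delta(f)}) = -\sqrt{\Delta(f)} \quad \text{and} \quad \sigma(\alpha_1 + \alpha_2 - \alpha_3 - \alpha_4) = -(\alpha_1 + \alpha_2 - \alpha_3 - \alpha_4).$$

It follows that $\sigma$ fixes (13.10), and since $\sigma$ generates the Galois group, we conclude
that $\sqrt{\Delta(f)(4\beta + c_1^2 - 4c_2)} \in F$. Thus $\Delta(f)(4\beta + c_1^2 - 4c_2) \in (F^*)^2$.

On the other hand, if $G = \langle (1324), (12) \rangle$, then some $\sigma \in \mathrm{Gal}(L/F)$ maps to (12).
For this $\sigma$, we have

$$\sigma(\sqrt{\Delta(f)}) = -\sqrt{\Delta(f)} \quad \text{and} \quad \sigma(\alpha_1 + \alpha_2 - \alpha_3 - \alpha_4) = \alpha_1 + \alpha_2 - \alpha_3 - \alpha_4.$$

Hence $\sigma$ takes (13.10) to its negative. Since F has characteristic $\neq 2$ and (13.10) is
nonzero, we have $\sqrt{\Delta(f)(4\beta + c_1^2 - 4c_2)} \notin F$. Thus $\Delta(f)(4\beta + c_1^2 - 4c_2) \notin (F^*)^2$.

The above argument fails when $4\beta + c_1^2 - 4c_2 = 0$ (be sure you see why). In this
case, we will use

(13.11)    $\beta^2 - 4c_4 = (\alpha_1\alpha_2 + \alpha_3\alpha_4)^2 - 4\alpha_1\alpha_2\alpha_3\alpha_4 = (\alpha_1\alpha_2 - \alpha_3\alpha_4)^2.$

In Exercise 6 you will show that $4\beta + c_1^2 - 4c_2 = 0$ implies that $\beta^2 - 4c_4 \in F^*$. Then,
arguing as above, one easily sees that

$$\sqrt{\Delta(f)(\beta^2 - 4c_4)} = \sqrt{\Delta(f)}\,(\alpha_1\alpha_2 - \alpha_3\alpha_4) \notin F^*$$

if and only if $G = \langle (1324), (12) \rangle$ (see Exercise 6). This completes the proof. ∎

We now give some examples of Theorem 13.1.1.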


Example 13.1.2 Consider $f = x^4 - 4x^2 + x + 1 \in \mathbb{Q}[x]$. One can show that f is
irreducible of discriminant $\Delta(f) = 1957 = 19 \cdot 103$ and that its resolvent

$$\theta_f(y) = y^3 + 4y^2 - 4y - 17$$

is irreducible over Q. By Theorem 13.1.1, the Galois group of f is $S_4$, so that the
splitting field of f has degree 24 over Q. This has the following consequences:

• In Example 8.6.7 of Section 8.6, we used f as an example of an irreducible
polynomial of degree 4 whose roots are all real yet cannot be expressed by real
radicals. This follows from Theorem 8.6.5, since the degree of the splitting field
over Q is not a power of 2.

• In Example 10.1.13 of Section 10.1, we used f as an example of a polynomial of
degree 4 whose roots are not constructible. This follows from Theorem 10.1.12,
since the degree of the splitting field over Q is not a power of 2. ◁▷
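
The numbers quoted in Example 13.1.2 can be checked by substituting the coefficients into (13.2) and (13.3). The following Python sketch does this. The function names `discriminant`, `ferrari_resolvent`, and `integer_roots` are illustrative choices of ours, not notation from the text, and the irreducibility of f itself is taken for granted.

```python
def discriminant(c1, c2, c3, c4):
    """Discriminant (13.2) of f = x^4 - c1*x^3 + c2*x^2 - c3*x + c4."""
    return (144*c2*c1**2*c4**2 + 18*c1*c3**3*c2 - 192*c1*c3*c4**2 - 6*c1**2*c3**2*c4
            + 144*c4*c3**2*c2 - 4*c2**3*c1**2*c4 + c2**2*c1**2*c3**2 + 256*c4**3
            - 27*c3**4 + 18*c1**3*c3*c2*c4 - 4*c1**3*c3**3 - 128*c2**2*c4**2
            + 16*c2**4*c4 - 4*c2**3*c3**2 - 27*c1**4*c4**2 - 80*c1*c3*c2**2*c4)


def ferrari_resolvent(c1, c2, c3, c4):
    """Coefficients of the cubic (13.3), highest degree first."""
    return [1, -c2, c1*c3 - 4*c4, -c3**2 - c1**2*c4 + 4*c2*c4]


def integer_roots(coeffs):
    """Integer roots of a monic polynomial with integer coefficients.

    Rational roots of such a polynomial are integers, and every root r
    satisfies |r| <= 1 + max|coefficient| (Cauchy's bound), so a finite
    search finds all rational roots.
    """
    bound = 1 + max(abs(c) for c in coeffs[1:])

    def value(y):
        acc = 0
        for c in coeffs:  # Horner evaluation
            acc = acc * y + c
        return acc

    return [y for y in range(-bound, bound + 1) if value(y) == 0]


# f = x^4 - 4x^2 + x + 1 corresponds to c1 = 0, c2 = -4, c3 = -1, c4 = 1.
c = (0, -4, -1, 1)
disc = discriminant(*c)
theta = ferrari_resolvent(*c)
roots = integer_roots(theta)
print(disc, theta, roots)
```

A cubic with no root in Q is irreducible over Q, so an empty list of roots is consistent with the claim that the resolvent is irreducible.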


In Exercises 7 and 8 you will apply Theorem 13.1.1 to other quartic polynomials 
from earlier in the text. 
Our next example is taken from [Chebotarev, p. 253]. 


Example 13.1.3 Suppose that $f = x^4 + ax^3 + bx^2 + ax + 1 \in F[x]$ is irreducible,
where F has characteristic $\neq 2$. In this case, the resolvent is

$$\begin{aligned}
\theta_f(y) &= y^3 - by^2 + (a^2 - 4)\,y - 2a^2 + 4b \\
&= (y - 2)(y^2 + (2 - b)\,y + a^2 - 2b),
\end{aligned}$$

and the discriminant is

$$\Delta(f) = (4b - a^2 - 8)^2 (b - 2a + 2)(b + 2a + 2).$$

By Theorem 13.1.1, it follows that the Galois group of f is Z/2Z x Z/2Z if and only
if $(b - 2a + 2)(b + 2a + 2) = (b + 2)^2 - 4a^2 \in F^2$.

The above factorization of $\theta_f(y)$ is easy to find using Maple or Mathematica.
In Exercise 9 you will show that the factor $y - 2$ of $\theta_f(y)$ arises naturally from the
symmetry of f. ◁▷
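
As a sanity check on Example 13.1.3, one can substitute $c_1 = -a$, $c_2 = b$, $c_3 = -a$, $c_4 = 1$ into (13.2) and (13.3) and compare with the stated root y = 2 and the stated factored discriminant. The sketch below does this for a grid of integer values of a and b. The helper names are hypothetical, and a numerical grid is used only as an illustration in place of a computer algebra system.

```python
def discriminant(c1, c2, c3, c4):
    """Discriminant (13.2) of f = x^4 - c1*x^3 + c2*x^2 - c3*x + c4."""
    return (144*c2*c1**2*c4**2 + 18*c1*c3**3*c2 - 192*c1*c3*c4**2 - 6*c1**2*c3**2*c4
            + 144*c4*c3**2*c2 - 4*c2**3*c1**2*c4 + c2**2*c1**2*c3**2 + 256*c4**3
            - 27*c3**4 + 18*c1**3*c3*c2*c4 - 4*c1**3*c3**3 - 128*c2**2*c4**2
            + 16*c2**4*c4 - 4*c2**3*c3**2 - 27*c1**4*c4**2 - 80*c1*c3*c2**2*c4)


def resolvent_at(c1, c2, c3, c4, y):
    """Value of the Ferrari resolvent (13.3) at y."""
    return y**3 - c2*y**2 + (c1*c3 - 4*c4)*y - c3**2 - c1**2*c4 + 4*c2*c4


def check(a, b):
    # f = x^4 + a x^3 + b x^2 + a x + 1 has c1 = -a, c2 = b, c3 = -a, c4 = 1.
    c = (-a, b, -a, 1)
    has_root_2 = resolvent_at(*c, 2) == 0
    disc_matches = discriminant(*c) == (4*b - a*a - 8)**2 * (b - 2*a + 2) * (b + 2*a + 2)
    return has_root_2 and disc_matches


all_ok = all(check(a, b) for a in range(-6, 7) for b in range(-6, 7))
print(all_ok)
```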


Here is an example taken from [18] that illustrates part (c) of Theorem 13.1.1. 


Example 13.1.4 Assume that $f = x^4 + bx^2 + d \in F[x]$ is irreducible, where F has
characteristic $\neq 2$. Also assume that $d \notin F^2$. In Exercise 10 you will show that f has
discriminant

$$\Delta(f) = 16d\,(b^2 - 4d)^2$$

and resolvent

$$\theta_f(y) = y^3 - by^2 - 4dy + 4bd = (y - b)(y^2 - 4d).$$

Since $d \notin F^2$ and $\theta_f(y)$ is reducible, part (c) of Theorem 13.1.1 applies with $\beta = b$. Using
$4\beta + c_1^2 - 4c_2 = 4b + 0^2 - 4b = 0$ and $\beta^2 - 4c_4 = b^2 - 4d$, we see that the Galois
group of f is $D_8$ if $d(b^2 - 4d) \notin (F^*)^2$ and Z/4Z otherwise.

See [18] for an analysis of what happens when $d \in F^2$. ◁▷
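
The case analysis of Theorem 13.1.1 is mechanical enough to be written as a short program when F = Q and the coefficients are integers. The following is a minimal sketch under stated assumptions, not a general implementation: the quartic is assumed to be irreducible over Q (this is not checked), and the function name `galois_group_of_quartic` and the string labels it returns are our own choices.

```python
from math import isqrt


def discriminant(c1, c2, c3, c4):
    """Discriminant (13.2) of f = x^4 - c1*x^3 + c2*x^2 - c3*x + c4."""
    return (144*c2*c1**2*c4**2 + 18*c1*c3**3*c2 - 192*c1*c3*c4**2 - 6*c1**2*c3**2*c4
            + 144*c4*c3**2*c2 - 4*c2**3*c1**2*c4 + c2**2*c1**2*c3**2 + 256*c4**3
            - 27*c3**4 + 18*c1**3*c3*c2*c4 - 4*c1**3*c3**3 - 128*c2**2*c4**2
            + 16*c2**4*c4 - 4*c2**3*c3**2 - 27*c1**4*c4**2 - 80*c1*c3*c2**2*c4)


def integer_roots(coeffs):
    """All rational (hence integer) roots of a monic integer polynomial."""
    bound = 1 + max(abs(c) for c in coeffs[1:])  # Cauchy's bound

    def value(y):
        acc = 0
        for c in coeffs:
            acc = acc * y + c
        return acc

    return [y for y in range(-bound, bound + 1) if value(y) == 0]


def is_square(n):
    """True if the integer n is the square of an integer."""
    return n >= 0 and isqrt(n) ** 2 == n


def galois_group_of_quartic(c1, c2, c3, c4):
    """Theorem 13.1.1 for f = x^4 - c1 x^3 + c2 x^2 - c3 x + c4 over Q.

    Assumes integer coefficients and that f is irreducible over Q.
    """
    disc = discriminant(c1, c2, c3, c4)
    roots = integer_roots([1, -c2, c1*c3 - 4*c4, -c3**2 - c1**2*c4 + 4*c2*c4])
    if not roots:                      # part (a): resolvent irreducible
        return "A4" if is_square(disc) else "S4"
    if is_square(disc):                # part (b): resolvent splits completely
        return "Z/2Z x Z/2Z"
    beta = roots[0]                    # part (c): unique root in Q
    t = 4*beta + c1**2 - 4*c2
    key = disc * t if t != 0 else disc * (beta**2 - 4*c4)
    return "Z/4Z" if is_square(key) else "D8"


print(galois_group_of_quartic(0, -4, -1, 1))   # x^4 - 4x^2 + x + 1
print(galois_group_of_quartic(0, -10, 0, 1))   # x^4 - 10x^2 + 1
print(galois_group_of_quartic(0, 5, 0, 5))     # x^4 + 5x^2 + 5
print(galois_group_of_quartic(0, 0, 0, -3))    # x^4 - 3
```

For an integer n, membership in $\mathbb{Q}^2$ is the same as being a perfect square in Z, which is why `is_square` suffices here. Over other fields both the root search and the square test would need to be replaced.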


In Section 13.3, we will give a version of Theorem 13.1.1 that works for all fields,
not just those of characteristic $\neq 2$. To prepare for this, we need the criterion from
[18] for distinguishing between Z/4Z and $D_8$ in part (c) of Theorem 13.1.1.

Proposition 13.1.5 As in Theorem 13.1.1, let $f = x^4 - c_1x^3 + c_2x^2 - c_3x + c_4 \in F[x]$
and assume that F has characteristic $\neq 2$. Also assume that the Ferrari resolvent
$\theta_f(y)$ has a unique root $\beta \in F$. Then the Galois group of f is isomorphic to either
Z/4Z or $D_8$, and the former occurs if and only if $(y^2 - c_1y + c_2 - \beta)(y^2 - \beta y + c_4)$
splits completely over $F(\sqrt{\Delta(f)})$.


Proof: We assume the same setup as the proof of part (c) of Theorem 13.1.1. We
begin with an observation about quadratic polynomials. Let $g = y^2 + Ay + B \in F[y]$
be such that either $\Delta(g) = 0$ or g is irreducible over F. Also let $F \subset F(\sqrt{a})$ be a
quadratic extension where $a \in F$. In part (a) of Exercise 12, you will show that

(13.12)    g splits completely over $F(\sqrt{a})$ $\iff$ $a\,\Delta(g) \in F^2$.

We next observe that $h_1 = y^2 - c_1y + c_2 - \beta$ has discriminant

$$\Delta(h_1) = c_1^2 - 4(c_2 - \beta) = 4\beta + c_1^2 - 4c_2 = (\alpha_1 + \alpha_2 - \alpha_3 - \alpha_4)^2,$$

where the last equality is (13.9). Since $\theta_f$ has a unique root $\beta = \alpha_1\alpha_2 + \alpha_3\alpha_4$
in F, the proof of part (c) of Theorem 13.1.1 gives $\sigma \in \mathrm{Gal}(L/F)$ that maps to
$(1324) \in S_4$. Then $\sigma(\alpha_1 + \alpha_2 - \alpha_3 - \alpha_4) = -(\alpha_1 + \alpha_2 - \alpha_3 - \alpha_4)$, which implies
that either $\Delta(h_1) = 0$ or $h_1$ is irreducible over F. Similarly, $h_2 = y^2 - \beta y + c_4$ has
discriminant $\Delta(h_2) = \beta^2 - 4c_4$, and using $\sigma$ and (13.11), we see that either $\Delta(h_2) = 0$
or $h_2$ is irreducible over F. Then (13.12) implies that

(13.13)    $h_1h_2$ splits completely over $F(\sqrt{\Delta(f)})$
           $\iff$ $\Delta(f)(4\beta + c_1^2 - 4c_2),\ \Delta(f)(\beta^2 - 4c_4) \in F^2$.

If $h_1h_2$ splits completely over $F(\sqrt{\Delta(f)})$, then (13.13) and part (c) of
Theorem 13.1.1 imply that $\mathrm{Gal}(L/F) \simeq \mathbb{Z}/4\mathbb{Z}$. Conversely, if $\mathrm{Gal}(L/F) \simeq \mathbb{Z}/4\mathbb{Z}$, then
L contains a unique quadratic extension of F, which must be $F(\sqrt{\Delta(f)})$ since
$\Delta(f) \notin F^2$ by part (c) of the theorem. Since $h_1$ and $h_2$ split completely over L by
part (b) of Exercise 12, they split over quadratic extensions of F contained in L.
Hence $h_1h_2$ splits completely over $F(\sqrt{\Delta(f)})$. ∎


The proof of Proposition 13.1.5 is based on [22]. Another method for handling 
part (c) of Theorem 13.1.1 is described in Exercise 13. 


Mathematical Notes 


The text contains several ideas that will be developed in subsequent sections. Here 
are some remarks to help us see what is involved. 


• Transitive Permutation Groups. We noted in the proof of Theorem 13.1.1 that
G is a transitive subgroup of $S_4$, since f is irreducible. Transitive subgroups of $S_n$
will play a prominent role in this chapter and the next. For example, implicit in
Theorem 13.1.1 is the following classification of the transitive subgroups of $S_4$.

Theorem 13.1.6 Up to conjugacy, the transitive subgroups of $S_4$ are

$$S_4, \quad A_4, \quad \langle (1324), (12) \rangle, \quad \langle (1324) \rangle, \quad \langle (12)(34), (13)(24) \rangle.$$


Proof: Let $G \subset S_4$ be a transitive subgroup. If we can prove that G arises from the
Galois action on the roots of some monic irreducible quartic over a field of
characteristic $\neq 2$, then G is conjugate to one of the above five groups by Theorem 13.1.1.

We will find the desired quartic polynomial using the methods of Section 7.4.
When F = Q, the universal extension in degree 4,

$$K = \mathbb{Q}(\sigma_1, \sigma_2, \sigma_3, \sigma_4) \subset L = \mathbb{Q}(x_1, x_2, x_3, x_4),$$

is the splitting field of the universal quartic $\tilde{f} = x^4 - \sigma_1x^3 + \sigma_2x^2 - \sigma_3x + \sigma_4$. Given
a transitive subgroup $G \subset S_4 \simeq \mathrm{Gal}(L/K)$, the corresponding fixed field

$$K \subset M \subset L$$

satisfies $\mathrm{Gal}(L/M) \simeq G$. Also observe that L is the splitting field over M of $\tilde{f}$ and
that $\tilde{f}$ is irreducible over M since G is transitive (this is Proposition 6.3.7). As noted
above, we are now done by Theorem 13.1.1. ∎


In practice, a standard strategy for computing Galois groups is the reverse of what
we did in this section. When considering irreducible polynomials $f \in F[x]$ of degree
n, one first finds all transitive subgroups of $S_n$ up to conjugacy and then, for each
such subgroup, determines criteria for the Galois group of f to be conjugate to that
subgroup. This is the approach we will use for the quintic in Section 13.2.


• Resolvents. The general theory of resolvents is based on the ideas of Lagrange
discussed in Section 12.1. For $\theta_f(y)$, recall from the proof of Theorem 13.1.1 that we
began in the universal case with $x_1x_2 + x_3x_4$ and constructed the Ferrari resolvent $\theta(y)$
of the universal quartic $\tilde{f}$. Then specializing to f gave $\theta_f(y)$. All of the resolvents
considered in this chapter will be constructed similarly.

Besides $\theta_f(y)$, Theorem 13.1.1 needs to know whether or not $\Delta(f) \in F^2$. This
can be stated in terms of the resolvent polynomial $y^2 - \Delta(f)$, since $\Delta(f) \in F^2$ if and
only if $y^2 - \Delta(f)$ has a root in F.

In general, if a resolvent has a root in F, then this puts strong restrictions on the
Galois group. For example, if $y^2 - \Delta(f)$ has a root in F, then the group G lies
in $A_4$ (be sure you understand why), and if $\theta_f(y)$ has a root in F, then the proof of
Theorem 13.1.1 shows that some conjugate of G lies in $\langle (1324), (12) \rangle$. By combining
information from different resolvents, we can obtain precise information about the
Galois group. We will pursue these ideas in Section 13.3.

We can also use resolvents to explain part (c) of Theorem 13.1.1. This part of the
theorem says that if $\beta \in F$ is a root of $\theta_f(y)$ and $\Delta(f) \notin F^2$, then the Galois group is
$D_8$ when $\beta$ satisfies the condition that either

$$4\beta + c_1^2 - 4c_2 \neq 0 \quad \text{and} \quad \Delta(f)(4\beta + c_1^2 - 4c_2) \notin (F^*)^2$$

or

$$4\beta + c_1^2 - 4c_2 = 0 \quad \text{and} \quad \Delta(f)(\beta^2 - 4c_4) \notin (F^*)^2.$$

The first part of the condition implies that $y^2 - \Delta(f)(4\beta + c_1^2 - 4c_2)$ has no roots in
F. Because of the appearance of $\beta$, we will call $y^2 - \Delta(f)(4\beta + c_1^2 - 4c_2)$ a relative
resolvent in Section 13.3. Similarly, the second part of the condition can be stated in
terms of the relative resolvent $y^2 - \Delta(f)(\beta^2 - 4c_4)$.

Furthermore, $\Delta(f)(4\beta + c_1^2 - 4c_2)$ and $\Delta(f)(\beta^2 - 4c_4)$ are nonzero if and only
if the corresponding relative resolvents have simple roots (as defined in Section 5.3).
We will see in Section 13.3 that simple roots of resolvents or relative resolvents are
needed in order to get useful information about the Galois group.


• Diophantine Equations. So far, we have always had a fixed polynomial whose
Galois group we wanted to determine. But if we let the coefficients of the polynomial
vary, then the criteria of Theorem 13.1.1 lead to some interesting equations. We
begin by revisiting an earlier example.

Example 13.1.7 Let $f = x^4 + ax^3 + bx^2 + ax + 1 \in \mathbb{Q}[x]$ and assume that f is
irreducible over Q. By Example 13.1.3, the Galois group of f over Q is Z/2Z x Z/2Z
if and only if $(b + 2)^2 - 4a^2 \in \mathbb{Q}^2$. The latter is equivalent to saying that

(13.14)    $(b + 2)^2 - 4a^2 = c^2$

for some $c \in \mathbb{Q}$. If we write this as

$$4a^2 + c^2 = (b + 2)^2,$$

then $f = x^4 + ax^3 + bx^2 + ax + 1 \in \mathbb{Q}[x]$ has Z/2Z x Z/2Z as Galois group if and
only if there is $c \in \mathbb{Q}$ such that $(x, y, z) = (2a, c, b + 2)$ lies on the cone

$$x^2 + y^2 = z^2.$$

Hence we have an equation where we only want solutions whose coordinates all lie
in Q. This is an example of a Diophantine equation. Such equations are an important
part of number theory.

In Exercise 11 you will show that if $f = x^4 + ax^3 + bx^2 + ax + 1$ is irreducible
with positive integer coefficients, then the Galois group is Z/2Z x Z/2Z if and only
if there is $c > 0$ in Z such that $(2a, c, b + 2)$ is a Pythagorean triple, i.e., the integers
$2a, c, b + 2$ are the sides of a right triangle with $b + 2$ as hypotenuse. ◁▷
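
The Pythagorean condition in Example 13.1.7 is easy to explore by brute force. The sketch below lists small positive integers a, b for which $(b+2)^2 - 4a^2$ is a positive perfect square. The function name `v4_candidates` is hypothetical, and the search only produces candidates: irreducibility of f is not tested here and must be checked separately, since some candidates give reducible quartics.

```python
from math import isqrt


def v4_candidates(limit):
    """Triples (a, b, c) with 1 <= a, b <= limit, c > 0 and 4a^2 + c^2 = (b + 2)^2."""
    found = []
    for a in range(1, limit + 1):
        for b in range(1, limit + 1):
            n = (b + 2) ** 2 - 4 * a * a
            if n > 0:
                c = isqrt(n)
                if c * c == n:
                    found.append((a, b, c))
    return found


candidates = v4_candidates(25)
print(candidates)
```

For instance, the triple (5, 12, 13) corresponds to 2a = 12, c = 5, b + 2 = 13, that is, a = 6 and b = 11.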


Here is a polynomial that leads to a more sophisticated equation. 


Example 13.1.8 Assume that $f = x^4 + x + b \in \mathbb{Z}[x]$ is irreducible over Q. This
polynomial has discriminant

$$\Delta(f) = 256b^3 - 27$$

and resolvent

$$\theta_f(y) = y^3 - 4by - 1.$$

It is easy to see that $\theta_f(y)$ is irreducible over Q, since its only possible rational roots
are $\pm 1$ (be sure you can explain why). Thus the Galois group over Q is either $S_4$ or
$A_4$, and the latter happens if and only if $256b^3 - 27 \in \mathbb{Q}^2$. In fact, we can replace $\mathbb{Q}^2$
with $\mathbb{Z}^2$ since $b \in \mathbb{Z}$. Thus the Galois group is $A_4$ if and only if there is $c \in \mathbb{Z}$ such
that the point $(x, y) = (b, c)$ lies on the curve

$$y^2 = 256x^3 - 27.$$

This is an example of an elliptic curve. A famous theorem of Siegel asserts that such
an equation has at most finitely many integer solutions (see [31]). So there are at
most finitely many integers b such that $f = x^4 + x + b$ has Galois group $A_4$ over Q.

This example can be extended in several ways. First, one could allow b to be a
rational number. Then one seeks rational points on the above elliptic curve. Some
of the deepest conjectures in number theory involve rational points on elliptic curves
(see [31] for an introduction). Another direction would be to consider polynomials
$x^4 + ax + b \in \mathbb{Z}[x]$ with Galois group $A_4$. This problem is solved in [35] using methods
from algebraic number theory. ◁▷
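
To get a feel for the integer points in Example 13.1.8, one can simply search a range of b for which $256b^3 - 27$ is a perfect square. The sketch below does so. The function name `a4_candidates` and the search bound are arbitrary choices of ours, and a finite search is an illustration rather than a proof.

```python
from math import isqrt


def a4_candidates(bound):
    """Pairs (b, c) with |b| <= bound, c >= 0 and c^2 = 256*b^3 - 27."""
    hits = []
    for b in range(-bound, bound + 1):
        n = 256 * b**3 - 27
        if n >= 0 and isqrt(n) ** 2 == n:
            hits.append((b, isqrt(n)))
    return hits


hits = a4_candidates(1000)
# Residues of 256*b^3 - 27 modulo 8, compared with the squares modulo 8.
residues = {(256 * b**3 - 27) % 8 for b in range(-20, 21)}
squares_mod_8 = {(k * k) % 8 for k in range(8)}
print(hits, residues, squares_mod_8)
```

The residue comparison points to a simple reason why the search comes up empty: $256b^3$ is divisible by 8, so $256b^3 - 27$ always leaves remainder 5 modulo 8, whereas a square leaves remainder 0, 1, or 4. This remark is ours and is not made in the text. It is consistent with the finiteness statement above.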


Historical Notes 


The first person to give a systematic method for finding the Galois group of a 
quartic was F. Hack, in his unpublished 1895 dissertation. Many books and papers 
have addressed this problem—see the references in [18], to which one can add 
[Escofier] and [Garling]. Our version of Theorem 13.1.1 is based on [17]. 


Exercises for Section 13.1 


Exercise 1. Let $f \in F[x]$ be separable of degree n, and let $\alpha_1, \dots, \alpha_n$ be the roots of f in a
splitting field $F \subset L$ of f. In Section 6.3 we used the action of the Galois group on the roots to
construct a one-to-one group homomorphism $\phi_1 : \mathrm{Gal}(L/F) \to S_n$. Now let $\beta_1, \dots, \beta_n$ be the
same roots, possibly written in a different order. This gives $\phi_2 : \mathrm{Gal}(L/F) \to S_n$. To relate $\phi_1$
and $\phi_2$, note that there is $\gamma \in S_n$ such that $\beta_i = \alpha_{\gamma(i)}$ for $1 \le i \le n$. Now define the conjugation
map $\hat{\gamma} : S_n \to S_n$ by $\hat{\gamma}(\tau) = \gamma^{-1}\tau\gamma$.

(a) Prove that $\phi_2 = \hat{\gamma} \circ \phi_1$.

(b) Let $G \subset S_n$ be the image of $\phi_1$. Explain why part (a) justifies the assertion made in the
text that “if we change the labels, then G gets replaced with a conjugate subgroup.”


Exercise 2. Prove that $A_4$ is the only subgroup of $S_4$ with 12 elements.
Exercise 3. Explain carefully why (13.6) follows from Exercise 9 of Section 2.4. 


Exercise 4. Use Example 7.3.4 from Chapter 7 to show that (13.8) gives all subgroups of 
((1324), (12)) of order 4 or 8. 


Exercise 5. Let F be a field of characteristic $\neq 2$, and let $g \in F[x]$ be a monic cubic polynomial
that has a root in F. Prove that g splits completely over F if and only if $\Delta(g) \in F^2$.

Exercise 6. This exercise is concerned with the proof of part (c) of Theorem 13.1.1. Let
$f(x) = x^4 - c_1x^3 + c_2x^2 - c_3x + c_4$ be as in the theorem.

(a) Suppose that f has roots $\alpha_1, \alpha_2, \alpha_3, \alpha_4$ such that $\alpha_1 + \alpha_2 - \alpha_3 - \alpha_4 = \alpha_1\alpha_2 - \alpha_3\alpha_4 = 0$.
Prove that f is not separable.

(b) Let $\beta$ be a root of the resolvent $\theta_f(y)$. Use part (a) to prove that $4\beta + c_1^2 - 4c_2$ and $\beta^2 - 4c_4$
can't both vanish when f is separable.

(c) Suppose that $4\beta + c_1^2 - 4c_2 = 0$ in part (c) of Theorem 13.1.1. Prove carefully that G is
conjugate to $\langle (1324), (12) \rangle$ if and only if $\Delta(f)(\beta^2 - 4c_4) \notin (F^*)^2$.


Exercise 7. In Exercise 18 of Section 12.1 you found the roots of $f = x^4 + 2x^2 - 4x + 2 \in \mathbb{Q}[x]$
using the formulas developed in that section. At the end of the exercise, we said that “this
quartic is especially simple.” Justify this assertion using Theorem 13.1.1.


Exercise 8. In Example 10.3.10, we showed that the roots of $f = 7m^4 - 16m^3 - 21m^2 + 8m + 4 \in \mathbb{Q}[m]$
can be constructed using origami. Show that the splitting field of f is an extension of
Q of degree 24. By the results of Section 10.1, it follows that the roots of f are not constructible
with straightedge and compass, since 24 is not a power of 2.


Exercise 9. As in Example 13.1.3, let $f = x^4 + ax^3 + bx^2 + ax + 1 \in F[x]$, and let $\alpha$ be a root
of f in some splitting field of f over F. Show that $\alpha^{-1}$ is also a root of f, and then use (13.5)
to conclude that 2 is a root of the resolvent $\theta_f(y)$.


Exercise 10. As in Example 13.1.4, let $f = x^4 + bx^2 + d \in F[x]$, where $d \notin F^2$. Compute
$\Delta(f)$ and $\theta_f(y)$.


Exercise 11. In Example 13.1.7 we showed that if $f = x^4 + ax^3 + bx^2 + ax + 1 \in \mathbb{Z}[x]$ is
irreducible over Q, then its Galois group is Z/2Z x Z/2Z if and only if there is $c \in \mathbb{Q}$ such
that $4a^2 + c^2 = (b + 2)^2$.

(a) Show that $c \in \mathbb{Z}$, and use the irreducibility of f to prove that $c \neq 0$. Hence we may
assume that $c > 0$, so that $(2a, c, b + 2)$ is a Pythagorean triple.

(b) Show that $3^2 + 4^2 = 5^2$, $5^2 + 12^2 = 13^2$, $7^2 + 24^2 = 25^2$, and $8^2 + 15^2 = 17^2$ give two
examples of polynomials with Z/2Z x Z/2Z as Galois group (two of the triples give
reducible polynomials).


Exercise 12. This exercise is concerned with the proof of Proposition 13.1.5. 
(a) Prove (13.12). 
(b) Prove that the two polynomials $h_1$ and $h_2$ defined in the proof of the proposition factor as
$h_1 = (y - (\alpha_1 + \alpha_2))(y - (\alpha_3 + \alpha_4))$ and $h_2 = (y - \alpha_1\alpha_2)(y - \alpha_3\alpha_4)$.


Exercise 13. Suppose that $f \in F[x]$ satisfies the hypothesis of part (c) of Theorem 13.1.1,
and let $\alpha$ be a root of f. Prove that $G \simeq \mathbb{Z}/4\mathbb{Z}$ if f splits completely over $F(\alpha)$, and $G \simeq D_8$
otherwise. This gives a version of part (c) that doesn't use resolvents. Since we can factor
over extension fields by Section 4.2, this method is useful in practice.


Exercise 14. Use Theorem 13.1.1 to compute the Galois groups of the following polynomials
in $\mathbb{Q}[x]$:

(a) $x^4 + 4x + 2$.

(b) $x^4 + 8x + 12$.

(c) $x^4 + 1$.

(d) $x^4 + x^3 + x^2 + x + 1$.

(e) $x^4 - 2$.

Exercise 15. In the situation of Theorem 13.1.1, assume that $\theta_f(y)$ has a root in F. In the
proof of the theorem, we used (13.5) and (13.7) to show that G is conjugate to a subgroup of
$D_8$. Show that the weaker assertion that |G| = 4 or 8 can be proved directly from (12.17).


Exercise 16. Consider the subgroups $\langle (12), (34) \rangle$ and $\langle (12)(34), (13)(24) \rangle$ of $S_4$.

(a) Prove that these subgroups are isomorphic but not conjugate. This shows that when 
classifying subgroups of a given group, it can happen that nonconjugate subgroups can 
be isomorphic as abstract groups. 

(b) Explain why the subgroup ((12), (34)) isn’t mentioned in Theorems 13.1.1 and 13.1.6. 


13.2 QUINTIC POLYNOMIALS 


Polynomials of degree 5 have a richer Galois theory than those of degree 4. There 
are some obvious reasons for this: The computations are more complicated because 
the degree is higher, and the groups are more complicated because they need not be 
solvable. The surprise is that quintic equations have strong relations to many other 
areas of mathematics, including: 


• Geometry. The rotational symmetry group of the icosahedron is $A_5$, and geometrically
defined invariants of this group action have consequences for the quintic.

• Iteration. The “Galois theory” of Newton's method for solving polynomial
equations due to Doyle and McMullen [9] uses $A_5$ in a crucial way.

• Elliptic Functions. These functions arise in complex analysis and number theory,
yet can also be used to find roots of quintics that can't be solved by radicals.


Because of such connections, quintic equations are the subject of entire books, in 
particular those by King [19], Klein [20], and Shurman [30]. 

The aims of this section are more modest. We will focus on computing the Galois 
group of a quintic and in particular on determining when a quintic is solvable by 
radicals. As we will see, this will involve some substantial mathematics. 


A. Transitive Subgroups of $S_5$. If a quintic polynomial $f \in F[x]$ is separable
and irreducible, then its Galois group Gal(L/F) (where $F \subset L$ is a splitting field)
is isomorphic to a transitive subgroup of $S_5$. So we begin by determining these
subgroups up to conjugacy. Our first result is elementary.


Lemma 13.2.1 Let $G \subset S_5$ be a subgroup. Then the following are equivalent:
(a) G is transitive.

(b) |G| is divisible by 5.

(c) G contains a 5-cycle.


Proof: For (a) $\Rightarrow$ (b), recall that the order of an orbit divides the order of the group,
by the Fundamental Theorem of Group Actions (Theorem A.4.9). Then we are done,
since transitivity implies that {1,2,3,4,5} is an orbit of the action of G.

The implication (b) $\Rightarrow$ (c) is proved using the argument given in the discussion
following (6.8) in Section 6.4. Finally, for (c) $\Rightarrow$ (a), note that repeatedly applying
a 5-cycle $(i_1\, i_2\, i_3\, i_4\, i_5) \in G$ gives

$$i_1 \mapsto i_2 \mapsto i_3 \mapsto i_4 \mapsto i_5 \mapsto i_1 \mapsto \cdots.$$

Transitivity follows immediately, since $\{i_1, i_2, i_3, i_4, i_5\} = \{1, 2, 3, 4, 5\}$. ∎


It turns out that we already know most of the transitive subgroups of $S_5$ up to
conjugacy. More precisely, we have the following subgroups:

• The full symmetric group $S_5$ of order 120.

• The alternating group $A_5$ of order 60.

• The cyclic group $\langle (12345) \rangle$ of order 5.

• By Section 6.4, the one-dimensional affine linear group $\mathrm{AGL}(1, \mathbb{F}_p)$ is the group
of order $p(p - 1)$ consisting of maps $i \mapsto ai + b$ where $i, a, b \in \mathbb{F}_p$ and $a \neq 0$. If
we set p = 5 and regard {1,2,3,4,5} as congruence classes modulo 5, then

$$\mathrm{AGL}(1, \mathbb{F}_5) \subset S_5$$

is a subgroup of order 20. In particular, translation by 1 ($i \mapsto i + 1$) is the 5-cycle
(12345) and multiplication by 2 ($i \mapsto 2i$) is the 4-cycle (1243). Be sure you
understand this.


In Exercise 1 you will show that $\mathrm{AGL}(1, \mathbb{F}_5)$ is generated by (12345) and (1243).
Furthermore, (12345) is an even permutation while (1243) is odd. Hence

(13.15)    $\mathrm{AGL}(1, \mathbb{F}_5) \cap A_5$

is a proper subgroup of $\mathrm{AGL}(1, \mathbb{F}_5)$ containing $\langle (12345) \rangle$. The group (13.15) also
contains $(1243)^2 = (14)(23)$ (multiplication by 4; do you see why?). In Exercise 1
you will show that

$$\mathrm{AGL}(1, \mathbb{F}_5) \cap A_5 = \langle (12345), (14)(23) \rangle \simeq D_{10},$$

where $D_{10}$ is the dihedral group of order 10.
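
The statements about $\mathrm{AGL}(1, \mathbb{F}_5)$ made above can be confirmed by listing permutations explicitly. The following Python sketch is one way to do it. The representation of a permutation as a tuple of images and the helper names `from_cycles`, `compose`, `generate`, and `is_even` are our own conventions, not the text's.

```python
def from_cycles(*cycles, n=5):
    """Permutation of {1,...,n} given by disjoint cycles, stored as the tuple of images."""
    image = list(range(1, n + 1))
    for cycle in cycles:
        for k, a in enumerate(cycle):
            image[a - 1] = cycle[(k + 1) % len(cycle)]
    return tuple(image)


def compose(p, q):
    """The permutation 'apply q first, then p'."""
    return tuple(p[q[i] - 1] for i in range(len(q)))


def generate(generators):
    """Subgroup generated by the given permutations (closure under left multiplication)."""
    identity = tuple(range(1, len(generators[0]) + 1))
    group, frontier = {identity}, [identity]
    while frontier:
        g = frontier.pop()
        for s in generators:
            h = compose(s, g)
            if h not in group:
                group.add(h)
                frontier.append(h)
    return group


def is_even(p):
    """Parity via the number of inversions."""
    inversions = sum(1 for i in range(len(p)) for j in range(i + 1, len(p)) if p[i] > p[j])
    return inversions % 2 == 0


translation = from_cycles((1, 2, 3, 4, 5))  # i -> i + 1
doubling = from_cycles((1, 2, 4, 3))        # i -> 2i

# All maps i -> a*i + b on Z/5Z with a != 0, writing the class of 0 as 5.
affine = {tuple((a * i + b - 1) % 5 + 1 for i in range(1, 6))
          for a in range(1, 5) for b in range(5)}

agl = generate([translation, doubling])
even_part = {p for p in agl if is_even(p)}
d10 = generate([translation, from_cycles((1, 4), (2, 3))])
print(len(agl), agl == affine, len(even_part), even_part == d10)
```

The four printed values compare the generated group with the set of affine maps and its even part with the group generated by (12345) and (14)(23).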

The subgroup (13.15) and the four subgroups described in the bullets give five
subgroups of $S_5$, all transitive by Lemma 13.2.1 since their orders are divisible by 5.
These groups fit together in the diagram

(13.16)
                      S_5
                    /     \
        AGL(1,F_5)          A_5
                    \     /
               AGL(1,F_5) ∩ A_5
                       |
                  ⟨(12345)⟩

where each line segment indicates that the lower group is contained in the upper one.


We now classify transitive subgroups of $S_5$ up to conjugacy.

Theorem 13.2.2 Every transitive subgroup $G \subset S_5$ is conjugate to one of the groups
in the diagram (13.16).


Proof: By Lemma 13.2.1, G contains a 5-cycle. Hence, replacing G by a conjugate
if necessary, we may assume that $(12345) \in G$. The key idea of the proof is to
consider the number of cyclic subgroups of order 5 contained in G.

First suppose that $\langle (12345) \rangle$ is the only such subgroup of G. Then we have
$g\langle (12345) \rangle g^{-1} = \langle (12345) \rangle$ for all $g \in G$. In the language of the Mathematical
Notes to Section 7.2, this means that G is contained in the normalizer

$$N_{S_5}(\langle (12345) \rangle) = \{ g \in S_5 \mid g\langle (12345) \rangle g^{-1} = \langle (12345) \rangle \}.$$

In Section 14.1 we will more generally consider the normalizer

$$N_{S_p}(\langle (12 \ldots p) \rangle) = \{ g \in S_p \mid g\langle (12 \ldots p) \rangle g^{-1} = \langle (12 \ldots p) \rangle \},$$

where p is now any prime, and we will prove in Lemma 14.1.2 that

$$N_{S_p}(\langle (12 \ldots p) \rangle) = \mathrm{AGL}(1, \mathbb{F}_p).$$

This is part of Galois's brilliant analysis of which irreducible polynomials of prime
degree are solvable by radicals. Rather than repeat the argument here, we will simply
assume this result from Chapter 14.

It follows that if $\langle (12345) \rangle$ is the only subgroup of G of order 5, then G is a
subgroup of $\mathrm{AGL}(1, \mathbb{F}_5)$. In Exercise 1 you will show that this implies that G is one
of the groups

$$\langle (12345) \rangle, \quad \mathrm{AGL}(1, \mathbb{F}_5) \cap A_5, \quad \text{or} \quad \mathrm{AGL}(1, \mathbb{F}_5).$$

It remains to consider what happens when G contains more than one subgroup
of order 5. In Exercise 2 you will prove that $\langle (12345) \rangle$ is a 5-Sylow subgroup of
G. By the Third Sylow Theorem (see Theorem A.5.1 in Appendix A), the number
of 5-Sylow subgroups of G is congruent to 1 modulo 5. Since we have more than
one, we must have at least 6. Furthermore, each 5-Sylow subgroup has four 5-cycles,
and any two such subgroups intersect in the identity. Since we have at least six
such subgroups, G must contain at least twenty-four 5-cycles. Yet $S_5$ has exactly
twenty-four 5-cycles by Exercise 2. It follows that G contains all 5-cycles.

We are almost done. The easily verified identity 


shows that G contains all 3-cycles. We know from Section 8.4 that $A_5$ is generated
by 3-cycles. Hence G contains $A_5$, so that G is $A_5$ or $S_5$. This completes the proof. ∎
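
The two counting facts used at the end of this proof, that $S_5$ has exactly twenty-four 5-cycles and that every 3-cycle can be obtained from 5-cycles, can be checked by brute force. The sketch below does not reproduce the identity referred to in the proof. It simply verifies that every 3-cycle in $S_5$ is a product of two 5-cycles. Permutations are stored as tuples of images, and the helper names are hypothetical.

```python
from itertools import permutations


def compose(p, q):
    """The permutation 'apply q first, then p'."""
    return tuple(p[q[i] - 1] for i in range(len(q)))


def cycle_type(p):
    """Sorted tuple of cycle lengths of p, fixed points included."""
    seen, lengths = set(), []
    for start in range(1, len(p) + 1):
        if start not in seen:
            length, x = 0, start
            while x not in seen:
                seen.add(x)
                x = p[x - 1]
                length += 1
            lengths.append(length)
    return tuple(sorted(lengths))


S5 = list(permutations(range(1, 6)))
five_cycles = [p for p in S5 if cycle_type(p) == (5,)]
three_cycles = {p for p in S5 if cycle_type(p) == (1, 1, 3)}
products = {compose(p, q) for p in five_cycles for q in five_cycles}
print(len(five_cycles), len(three_cycles), three_cycles <= products)
```

So a subgroup of $S_5$ that contains all 5-cycles contains all 3-cycles, which is the step the proof needs.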


In Exercise 3 we will give a more elementary version of the above argument that 
doesn’t use the Third Sylow Theorem. 

As a corollary of Theorem 13.2.2, we get the following criterion for an irreducible
quintic to be solvable by radicals.

Corollary 13.2.3 Assume that $f \in F[x]$ is irreducible of degree 5 and that F has
characteristic 0. Then f is solvable by radicals over F if and only if its Galois group
over F is isomorphic to a subgroup of $\mathrm{AGL}(1, \mathbb{F}_5)$.


Proof: By Theorem 8.5.3, f is solvable by radicals if and only if its Galois group
is solvable. The Galois group is isomorphic to a subgroup of $S_5$, but $A_5$ and $S_5$ aren't
solvable, by Theorem 8.4.5. So one direction of the corollary follows immediately
from Theorem 13.2.2. For the other direction, we note that $\mathrm{AGL}(1, \mathbb{F}_5)$ and hence all
of its subgroups are solvable by Example 8.1.6 from Section 8.1. ∎


Section 14.1 will discuss Galois’s generalization of Corollary 13.2.3 in which 5 is 
replaced by an arbitrary prime p. 


B. Galois Groups of Quintics. Let f € F [x] be monic, separable, and irreducible 
of degree 5, where F is a field of characteristic 4 2. We will determine the Galois 
group of f over F using a discriminant, a sextic resolvent, and a factorization. 

The discriminant is the usual discriminant A(f), and the factorization will be 
described in the statement of Theorem 13.2.6. So let us turn our attention to the 
sextic resolvent. The idea is to find a polynomial 

he F\x),...,X5| 
with the property that 
AGL(1, Fs) = {0 € Ss |o-h=h}. 


Thus $h$ should have $\mathrm{AGL}(1,\mathbb{F}_5)$ as its symmetry group. We will use the polynomial

$h = u^2,$

where

(13.17) $u = x_1x_2 + x_2x_3 + x_3x_4 + x_4x_5 + x_5x_1 - x_1x_3 - x_3x_5 - x_5x_2 - x_2x_4 - x_4x_1.$


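The signs are best explained using the diagram

(13.18) [Diagram: a five-pointed star whose vertices are labeled 1, 2, 3, 4, 5 clockwise, with 1 at the top; its line segments join 1 and 3, 3 and 5, 5 and 2, 2 and 4, and 4 and 1.]

In the formula (13.17) for $u$, the coefficient of $x_ix_j$ (where $i \ne j$) is $-1$ if $i$ and $j$ are
connected by a line segment in (13.18) and $+1$ otherwise. Since (12345) rotates the
star by $2\pi/5$ radians, it follows that $(12345)\cdot u = u$.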

On the other hand, (1243) takes (13.18) to the diagram

[Diagram: the same star with its vertex labels permuted by (1243), so that the labels read 2, 4, 1, 3, 5 clockwise from the top.]

Here, $i$ and $j$ are connected by a line segment if and only if they are not connected in
(13.18). Hence (1243) interchanges all signs, so that $(1243)\cdot u = -u$.
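
Both sign claims can be confirmed mechanically. The following minimal sketch stores $u$ as a map from unordered pairs $\{i,j\}$ to the coefficient of $x_ix_j$ and lets a permutation act by replacing $x_i$ with $x_{\sigma(i)}$. The names `PENTAGON`, `STAR`, `cycle`, and `act` are illustrative choices, not notation from the text.

```python
PENTAGON = [(1, 2), (2, 3), (3, 4), (4, 5), (5, 1)]   # coefficient +1 in (13.17)
STAR = [(1, 3), (3, 5), (5, 2), (2, 4), (4, 1)]       # coefficient -1 in (13.17)

# u as a map {unordered pair {i, j}: coefficient of x_i x_j}
u = {frozenset(e): 1 for e in PENTAGON}
u.update({frozenset(e): -1 for e in STAR})

def cycle(*points):
    """Permutation of {1,...,5} given by a single cycle, as a dict i -> sigma(i)."""
    sigma = {i: i for i in range(1, 6)}
    for k, p in enumerate(points):
        sigma[p] = points[(k + 1) % len(points)]
    return sigma

def act(sigma, poly):
    """sigma . poly: replace each x_i by x_{sigma(i)}."""
    return {frozenset(sigma[i] for i in pair): c for pair, c in poly.items()}

rotation = cycle(1, 2, 3, 4, 5)
four_cycle = cycle(1, 2, 4, 3)
negated_u = {pair: -c for pair, c in u.items()}

print(act(rotation, u) == u)            # (12345) fixes u
print(act(four_cycle, u) == negated_u)  # (1243) sends u to -u
```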

It follows that $h = u^2$ is fixed by (12345) and (1243). Since these generate
$\mathrm{AGL}(1,\mathbb{F}_5)$, we see that $h$ is invariant under $\mathrm{AGL}(1,\mathbb{F}_5)$. This is the full symmetry
group of $h$, as we now prove.


Lemma 13.2.4 Let $h = u^2$, where $u$ is defined in (13.17). Then
$\mathrm{AGL}(1,\mathbb{F}_5) = \{\sigma \in S_5 \mid \sigma\cdot h = h\}.$


Proof: Let $G = \{\sigma \in S_5 \mid \sigma\cdot h = h\}$. Then $\mathrm{AGL}(1,\mathbb{F}_5) \subset G$ by the above argument.
If $G$ were strictly bigger than $\mathrm{AGL}(1,\mathbb{F}_5)$, then (13.16) would show that $G = S_5$, that
is, $h$ would be symmetric. However, observe that

$h = (x_1x_2 + x_2x_3 + \cdots - x_1x_3 - \cdots)^2$
$\ \ = x_1^2x_2^2 + x_2^2x_3^2 + x_1^2x_3^2 + 2x_1x_2^2x_3 - 2x_1^2x_2x_3 - 2x_1x_2x_3^2 + \cdots,$

where terms involving $x_4$ or $x_5$ are not shown. This makes it easy to see that $h$ is not
symmetric. Thus $G \ne S_5$, which implies that $G = \mathrm{AGL}(1,\mathbb{F}_5)$. ■


By Exercise 4, left coset representatives of $\mathrm{AGL}(1,\mathbb{F}_5)$ in $S_5$ are
(13.19) $e,\ (123),\ (234),\ (345),\ (145),\ (125).$
Thus the orbit of $S_5$ acting on $h$ consists of

$h_1 = e\cdot h = h,\quad h_2 = (123)\cdot h,\quad h_3 = (234)\cdot h,$
$h_4 = (345)\cdot h,\quad h_5 = (145)\cdot h,\quad h_6 = (125)\cdot h$

(be sure you can explain why). This enables us to form the universal sextic resolvent

(13.20) $\theta(y) = \prod_{i=1}^{6} (y - h_i).$

The methods of Section 12.1 imply that $\theta(y)$ has coefficients in $F[\sigma_1,\sigma_2,\sigma_3,\sigma_4,\sigma_5]$.
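
Lemma 13.2.4 and the orbit description can also be checked by direct computation. The sketch below is an illustration under its own conventions, not the method of the text: polynomials are dictionaries from exponent vectors to coefficients, a permutation $\sigma$ is the tuple $(\sigma(1),\dots,\sigma(5))$, and $\sigma$ acts by sending $x_i$ to $x_{\sigma(i)}$. It counts the permutations fixing $h = u^2$ and the distinct images of $h$ under the representatives (13.19).

```python
from itertools import permutations

PLUS = [(1, 2), (2, 3), (3, 4), (4, 5), (5, 1)]    # pentagon edges, coefficient +1
MINUS = [(1, 3), (3, 5), (5, 2), (2, 4), (4, 1)]   # star edges, coefficient -1

def monomial(i, j):
    """Exponent vector of x_i x_j (variables numbered 1..5)."""
    e = [0] * 5
    e[i - 1] += 1
    e[j - 1] += 1
    return tuple(e)

def multiply(p, q):
    """Product of two polynomials stored as {exponent vector: coefficient}."""
    out = {}
    for e1, c1 in p.items():
        for e2, c2 in q.items():
            e = tuple(s + t for s, t in zip(e1, e2))
            out[e] = out.get(e, 0) + c1 * c2
    return {e: c for e, c in out.items() if c != 0}

def act(perm, poly):
    """sigma . poly, where sigma sends x_i to x_{perm[i-1]}."""
    out = {}
    for e, c in poly.items():
        new = [0] * 5
        for i, power in enumerate(e):
            new[perm[i] - 1] += power
        out[tuple(new)] = c
    return out

u = {monomial(i, j): 1 for i, j in PLUS}
u.update({monomial(i, j): -1 for i, j in MINUS})
h = multiply(u, u)

# Symmetry group of h inside S_5.
stabilizer = [p for p in permutations(range(1, 6)) if act(p, h) == h]

# Coset representatives (13.19), written as tuples (sigma(1), ..., sigma(5)).
reps = [(1, 2, 3, 4, 5), (2, 3, 1, 4, 5), (1, 3, 4, 2, 5),
        (1, 2, 4, 5, 3), (4, 2, 3, 5, 1), (2, 5, 3, 4, 1)]
orbit = {frozenset(act(p, h).items()) for p in reps}

print(len(stabilizer), len(orbit))
```

A stabilizer of order 20 and six distinct images are what the lemma and (13.19) predict, since $120/20 = 6$.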

Our given monic separable irreducible quintic can be written as
$f = x^5 - c_1x^4 + c_2x^3 - c_3x^2 + c_4x - c_5 \in F[x].$

Let its splitting field be $F \subset L$, and let $\alpha_1,\alpha_2,\alpha_3,\alpha_4,\alpha_5$ be the roots of $f$ in $L$. Under
the evaluation map $x_i \mapsto \alpha_i$, we know that $\sigma_i \mapsto c_i \in F$. Thus (13.20) maps to the
sextic resolvent of $f$

(13.21) $\theta_f(y) = \prod_{i=1}^{6} (y - \beta_i) \in F[y],$

where
$\beta_i = h_i(\alpha_1,\alpha_2,\alpha_3,\alpha_4,\alpha_5) \in L.$


The structure of $\theta_f(y)$ is described by the following proposition, whose proof we
defer until later.


Proposition 13.2.5 Given $f \in F[x]$ as above, its sextic resolvent can be written
$\theta_f(y) = (y^3 + b_2y^2 + b_4y + b_6)^2 - 2^{10}\Delta(f)\,y,$

where $b_2, b_4, b_6 \in F$.


The Galois group of $f$ is $\mathrm{Gal}(L/F)$ for a splitting field $L$ of $f$ over $F$. Also recall
that $\mathrm{Gal}(L/F) \simeq G \subset S_5$, where $G$ is transitive. Here is our main result.


Theorem 13.2.6 Assume that $f \in F[x]$ is monic, separable, and irreducible of degree
5 and that $F$ is a field of characteristic $\ne 2$. Then the subgroup $G \subset S_5$ defined above
has the following properties:
(a) $G \subset A_5$ if and only if $\Delta(f) \in F^2$.
(b) $G$ is conjugate to a subgroup of $\mathrm{AGL}(1,\mathbb{F}_5)$ if and only if the sextic resolvent
$\theta_f(y)$ defined in (13.21) has a root in $F$.
(c) $G$ is conjugate to $\langle(12345)\rangle$ if and only if $f$ splits completely over $F(\alpha)$, where
$\alpha$ is a root of $f$.


Proof: Part (a) follows from Theorem 7.4.1 since $F$ has characteristic $\ne 2$. Also,
part (c) is relatively straightforward and is left as Exercise 5.

It remains to prove part (b). If $G$ is conjugate to a subgroup of $\mathrm{AGL}(1,\mathbb{F}_5)$,
then relabeling the roots if necessary, we may assume that $G \subset \mathrm{AGL}(1,\mathbb{F}_5)$. Let an
arbitrary $\sigma \in \mathrm{Gal}(L/F)$ correspond to $\tau \in G$. Then

$\sigma(\beta_1) = \sigma(h(\alpha_1,\alpha_2,\alpha_3,\alpha_4,\alpha_5))$
$\ \ = h(\sigma(\alpha_1),\sigma(\alpha_2),\sigma(\alpha_3),\sigma(\alpha_4),\sigma(\alpha_5))$
$\ \ = h(\alpha_{\tau(1)},\alpha_{\tau(2)},\alpha_{\tau(3)},\alpha_{\tau(4)},\alpha_{\tau(5)})$
$\ \ = (\tau\cdot h)(\alpha_1,\alpha_2,\alpha_3,\alpha_4,\alpha_5) = h(\alpha_1,\alpha_2,\alpha_3,\alpha_4,\alpha_5) = \beta_1,$

where the last line follows from $\tau \in G \subset \mathrm{AGL}(1,\mathbb{F}_5)$ and Lemma 13.2.4. Since
$F \subset L$ is a Galois extension, we must have $\beta_1 \in F$. Thus $\theta_f(y)$ has a root in $F$.

Conversely, suppose that $\theta_f(y)$ has a root in $F$. If $G$ is conjugate to a subgroup of
$\mathrm{AGL}(1,\mathbb{F}_5)$, then we are done. So assume that this is not true. Since $G$ is transitive,
Theorem 13.2.2 implies that $A_5 \subset G$. Let $\tau_1,\dots,\tau_6$ denote the elements of (13.19)
(the identity and five 3-cycles), numbered so that $\tau_i\cdot h = h_i$, and let $\sigma_i \in \mathrm{Gal}(L/F)$ map to $\tau_i$.
The existence of $\sigma_i$ follows from $A_5 \subset G$. Then arguing as above shows that

$\sigma_i(\beta_1) = \beta_i,\quad i = 1,\dots,6$

(be sure you can supply the details). By assumption, some $\beta_i \in F$, and then the above
equations easily imply that $\beta_1 = \cdots = \beta_6$. Hence the sextic resolvent is

$\theta_f(y) = (y - \beta_1)^6.$
Comparing this with Proposition 13.2.5, we obtain the identity
(13.22) $(y - \beta_1)^6 = (y^3 + b_2y^2 + b_4y + b_6)^2 - 2^{10}\Delta(f)\,y,$

where $\beta_1, b_2, b_4, b_6 \in F$. Multiplying this out and comparing the coefficients of $y^5$,
$y^4$, and $y^3$ gives the equations
$-6\beta_1 = 2b_2,$
$15\beta_1^2 = b_2^2 + 2b_4,$
$-20\beta_1^3 = 2b_2b_4 + 2b_6.$

In Exercise 5 you will verify these equations and use them to show that
$b_2 = -3\beta_1,\quad b_4 = 3\beta_1^2,\quad b_6 = -\beta_1^3,$
since $F$ has characteristic $\ne 2$. Then (13.22) becomes
$(y - \beta_1)^6 = (y^3 + b_2y^2 + b_4y + b_6)^2 - 2^{10}\Delta(f)\,y$
$\ \ = (y^3 - 3\beta_1y^2 + 3\beta_1^2y - \beta_1^3)^2 - 2^{10}\Delta(f)\,y$
$\ \ = (y - \beta_1)^6 - 2^{10}\Delta(f)\,y.$

Hence $2^{10}\Delta(f) = 0$. Yet $F$ has characteristic $\ne 2$, and $\Delta(f) \ne 0$, since $f$ is separable.
This contradiction completes the proof of the theorem. ■
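
The passage from the three coefficient equations to the values of $b_2$, $b_4$, $b_6$ is a short back-substitution. The lines below are only a sketch of that step (checking the coefficient equations themselves is still left to Exercise 5). Each line divides by 2, which is where the hypothesis on the characteristic is used.

```latex
\begin{align*}
2b_2 = -6\beta_1
  &\implies b_2 = -3\beta_1,\\
2b_4 = 15\beta_1^2 - b_2^2 = 15\beta_1^2 - 9\beta_1^2 = 6\beta_1^2
  &\implies b_4 = 3\beta_1^2,\\
2b_6 = -20\beta_1^3 - 2b_2b_4 = -20\beta_1^3 + 18\beta_1^3 = -2\beta_1^3
  &\implies b_6 = -\beta_1^3.
\end{align*}
```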


The structure of the sextic resolvent plays a crucial role in the above proof. So it
remains to prove Proposition 13.2.5.


Proof of Proposition 13.2.5: We will prove the proposition in the universal case and
then specialize. As in the proof of Theorem 13.2.6, let $\tau_i$ be the element of (13.19)
with $\tau_i\cdot h = h_i$. Then set

$u_i = \tau_i\cdot u,$

where $u$ is defined in (13.17). This gives the polynomial

$\Theta(y) = \prod_{i=1}^{6} (y - u_i).$

Using Maple or Mathematica and the methods of Section 2.3, one computes that
(13.23) $\Theta(y) = y^6 + B_2y^4 + B_4y^2 + B_6 - 2^5\sqrt{\Delta}\,y,$
where $\sqrt{\Delta} = \prod_{1 \le i < j \le 5}(x_i - x_j)$ is the square root of the discriminant, and
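(13.24)
$B_2 = 8\sigma_1\sigma_3 - 3\sigma_2^2 - 20\sigma_4,$

$B_4 = 3\sigma_2^4 - 16\sigma_1\sigma_2^2\sigma_3 + 16\sigma_1^2\sigma_3^2 + 16\sigma_2\sigma_3^2 + 16\sigma_1^2\sigma_2\sigma_4 - 8\sigma_2^2\sigma_4$
$\quad - 112\sigma_1\sigma_3\sigma_4 + 240\sigma_4^2 - 64\sigma_1^3\sigma_5 + 240\sigma_1\sigma_2\sigma_5 - 400\sigma_3\sigma_5,$

$B_6 = 8\sigma_1\sigma_2^4\sigma_3 - \sigma_2^6 - 16\sigma_1^2\sigma_2^2\sigma_3^2 - 16\sigma_2^3\sigma_3^2 + 64\sigma_1\sigma_2\sigma_3^3 - 64\sigma_3^4$
$\quad - 16\sigma_1^2\sigma_2^3\sigma_4 + 28\sigma_2^4\sigma_4 + 64\sigma_1^3\sigma_2\sigma_3\sigma_4 - 112\sigma_1\sigma_2^2\sigma_3\sigma_4$
$\quad - 128\sigma_1^2\sigma_3^2\sigma_4 + 224\sigma_2\sigma_3^2\sigma_4 - 64\sigma_1^4\sigma_4^2 + 224\sigma_1^2\sigma_2\sigma_4^2$
$\quad - 176\sigma_2^2\sigma_4^2 - 64\sigma_1\sigma_3\sigma_4^2 + 320\sigma_4^3 + 48\sigma_1\sigma_2^3\sigma_5$
$\quad - 192\sigma_1^2\sigma_2\sigma_3\sigma_5 - 80\sigma_2^2\sigma_3\sigma_5 + 640\sigma_1\sigma_3^2\sigma_5 + 384\sigma_1^3\sigma_4\sigma_5$
$\quad - 640\sigma_1\sigma_2\sigma_4\sigma_5 - 1600\sigma_3\sigma_4\sigma_5 - 1600\sigma_1^2\sigma_5^2 + 4000\sigma_2\sigma_5^2.$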


You will do this computation in Exercise 6. The lovely ideas that underlie the
formulas in (13.23) and (13.24) are explained in Exercises 7 and 8.
Since $h_i = u_i^2$, $\Theta(y)$ relates to the universal sextic resolvent $\theta(y)$ as follows:

$\theta(y^2) = \prod_{i=1}^{6} (y^2 - u_i^2) = \prod_{i=1}^{6} (y - u_i)(y + u_i) = \Theta(y)\,\Theta(-y).$

Combining this with (13.23), we see that $\theta(y^2)$ is the product
$(y^6 + B_2y^4 + B_4y^2 + B_6 - 2^5\sqrt{\Delta}\,y)(y^6 + B_2y^4 + B_4y^2 + B_6 + 2^5\sqrt{\Delta}\,y),$
which easily implies that
$\theta(y^2) = (y^6 + B_2y^4 + B_4y^2 + B_6)^2 - 2^{10}\Delta\,y^2.$
Replacing $y^2$ with $y$, we obtain the universal formula
$\theta(y) = (y^3 + B_2y^2 + B_4y + B_6)^2 - 2^{10}\Delta\,y.$


Then the evaluation $\sigma_i \mapsto c_i$ gives the desired formula for $\theta_f(y)$. ■

The proof just given shows that $\theta_f(y) = (y^3 + b_2y^2 + b_4y + b_6)^2 - 2^{10}\Delta(f)\,y$, where
$b_2, b_4, b_6$ are obtained from (13.24) by replacing $\sigma_i$ with $c_i$. This will be useful in the
examples computed below.
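
For concrete quintics it is convenient to have this substitution as a function. The sketch below transcribes the formulas (13.24) with $\sigma_i$ replaced by $c_i$, where $f = x^5 - c_1x^4 + c_2x^3 - c_3x^2 + c_4x - c_5$. The function name `resolvent_coefficients` is hypothetical, and the transcription should be checked against (13.24) before being relied on. Two sanity checks are that $(x-1)^5$ must give $b_2 = b_4 = b_6 = 0$ (all roots equal, so every $u_i$ vanishes) and that $x^5 - 6x + 3$ must give the cubic of Example 13.2.8.

```python
def resolvent_coefficients(c1, c2, c3, c4, c5):
    """Return (b2, b4, b6) for f = x^5 - c1 x^4 + c2 x^3 - c3 x^2 + c4 x - c5.

    Transcribed from (13.24) with sigma_i replaced by c_i.
    """
    b2 = 8*c1*c3 - 3*c2**2 - 20*c4
    b4 = (3*c2**4 - 16*c1*c2**2*c3 + 16*c1**2*c3**2 + 16*c2*c3**2
          + 16*c1**2*c2*c4 - 8*c2**2*c4 - 112*c1*c3*c4 + 240*c4**2
          - 64*c1**3*c5 + 240*c1*c2*c5 - 400*c3*c5)
    b6 = (8*c1*c2**4*c3 - c2**6 - 16*c1**2*c2**2*c3**2 - 16*c2**3*c3**2
          + 64*c1*c2*c3**3 - 64*c3**4 - 16*c1**2*c2**3*c4 + 28*c2**4*c4
          + 64*c1**3*c2*c3*c4 - 112*c1*c2**2*c3*c4 - 128*c1**2*c3**2*c4
          + 224*c2*c3**2*c4 - 64*c1**4*c4**2 + 224*c1**2*c2*c4**2
          - 176*c2**2*c4**2 - 64*c1*c3*c4**2 + 320*c4**3 + 48*c1*c2**3*c5
          - 192*c1**2*c2*c3*c5 - 80*c2**2*c3*c5 + 640*c1*c3**2*c5
          + 384*c1**3*c4*c5 - 640*c1*c2*c4*c5 - 1600*c3*c4*c5
          - 1600*c1**2*c5**2 + 4000*c2*c5**2)
    return b2, b4, b6

# x^5 - 6x + 3 has c4 = -6 and c5 = -3 (note the sign convention for c5).
print(resolvent_coefficients(0, 0, 0, -6, -3))
# (x - 1)^5 = x^5 - 5x^4 + 10x^3 - 10x^2 + 5x - 1.
print(resolvent_coefficients(5, 10, 10, 5, 1))
```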

We can also describe the irreducible factorization of $\theta_f(y)$ as follows.


Proposition 13.2.7 Let $f \in F[x]$ and $G \subset S_5$ be as in Theorem 13.2.6. Then:

$A_5 \subset G \iff \theta_f(y)$ is irreducible over $F$,

$G$ is conjugate to a subgroup of $\mathrm{AGL}(1,\mathbb{F}_5) \iff \theta_f(y) = (y - \beta)\,g(y)$, where $\beta \in F$
and $g(y) \in F[y]$ is irreducible over $F$.


Proof: You will prove this in Exercises 9 and 10. ■


C. Examples. We first note that Theorem 13.2.6 and diagram (13.16) lead to the
following table for determining the Galois group $\mathrm{Gal}(L/F) \simeq G \subset S_5$:


Is $\Delta(f)$     Does $\theta_f(y)$ have     Does $f(x)$ split              $G$ up to
in $F^2$?          a root in $F$?              completely over $F(\alpha)$?   conjugacy

No                 No                          -                              $S_5$
Yes                No                          -                              $A_5$
No                 Yes                         -                              $\mathrm{AGL}(1,\mathbb{F}_5)$
Yes                Yes                         No                             $\mathrm{AGL}(1,\mathbb{F}_5) \cap A_5$
Yes                Yes                         Yes                            $\langle(12345)\rangle$


In this table, $\alpha$ denotes a root of $f$, and the dashes in the third column indicate
cases when the first two columns determine the Galois group. You will prove the
correctness of the table in Exercise 11.

Here are three examples of how to use this table. 


Example 13.2.8 In Section 6.4 we showed that the Galois group of $f = x^5 - 6x + 3$
over $\mathbb{Q}$ is $S_5$. We can verify this as follows. In Exercise 12 you will show that $f$ is
irreducible with discriminant

$\Delta(f) = -1737531$
and sextic resolvent
$\theta_f(y) = (y^3 + 120y^2 + 8640y - 69120)^2 + 2^{10}\cdot 1737531\,y.$

You will also show that $\theta_f(y)$ is irreducible over $\mathbb{Q}$. Since $\Delta(f)$ is not in $\mathbb{Q}^2$, the
table implies that the Galois group is $S_5$. ◁▷


We will return to this example in Section 13.4. We next give an example that uses 
the third column of the table. 

Example 13.2.9 Consider $f = x^5 - 2 \in \mathbb{Q}(\sqrt{5})[x]$. In Exercise 13 you will show
that $f$ is irreducible over $\mathbb{Q}(\sqrt{5})$ with discriminant

$\Delta(f) = 50000 = 2^4 5^5$
and sextic resolvent
$\theta_f(y) = y^6 - 2^{10}\cdot 50000\,y = y^6 - 2^{14}5^5\,y.$

This obviously has a root in $\mathbb{Q}(\sqrt{5})$. Since $\Delta(f)$ is a square in $\mathbb{Q}(\sqrt{5})$, the table
tells us that the Galois group is either cyclic of order 5 or dihedral of order 10. To
distinguish these, we need to check if $f$ splits completely when we adjoin a root to
$\mathbb{Q}(\sqrt{5})$. But $f$ can’t split completely over $\mathbb{Q}(\sqrt{5}, \sqrt[5]{2})$, since this field is contained
in $\mathbb{R}$ while the roots of $f$ aren’t all real.
Hence the Galois group of $f$ over $\mathbb{Q}(\sqrt{5})$ is $\mathrm{AGL}(1,\mathbb{F}_5) \cap A_5 \simeq D_{10}$. ◁▷


In Exercise 13 you will redo this example using results from Chapters 6 and 7.

Example 13.2.10 A quintic polynomial studied by De Moivre and Euler is
$f = x^5 + px^3 + \tfrac{1}{5}p^2x + q \in \mathbb{Q}[x].$

We will assume that $f$ is irreducible over $\mathbb{Q}$. In Exercise 14 you will show that $f$ has
discriminant

$\Delta(f) = \dfrac{(4p^5 + 3125q^2)^2}{3125} = \dfrac{(4p^5 + 3125q^2)^2}{5^5}$

and sextic resolvent

$\theta_f(y) = \left(y^3 - 7p^2y^2 + 11p^4y + \tfrac{3}{25}p^6 + 4000pq^2\right)^2 - 2^{10}\,\dfrac{(4p^5 + 3125q^2)^2}{3125}\,y.$

You will also verify that $\theta_f(y)$ has a root $y = 5p^2 \in \mathbb{Q}$. Since $\Delta(f) \notin \mathbb{Q}^2$ (do you see
why?), the table implies that the Galois group is $\mathrm{AGL}(1,\mathbb{F}_5)$. ◁▷


In Exercise 14 you will give an elementary proof that the polynomial given in
Example 13.2.10 is solvable by radicals.
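
The claim that $y = 5p^2$ is a root of the resolvent in Example 13.2.10 is easy to spot-check with exact rational arithmetic. The snippet below is a sketch only: it evaluates the displayed formula for $\theta_f(y)$ at a few sample values of $p$ and $q$ (chosen arbitrarily), using the hypothetical helper `de_moivre_resolvent`. A numerical check of this kind does not replace the factorization asked for in Exercise 14.

```python
from fractions import Fraction

def de_moivre_resolvent(p, q, y):
    """Value of theta_f(y) for f = x^5 + p x^3 + (p^2/5) x + q, as in Example 13.2.10."""
    p, q, y = Fraction(p), Fraction(q), Fraction(y)
    disc = (4 * p**5 + 3125 * q**2) ** 2 / 3125
    cubic = (y**3 - 7 * p**2 * y**2 + 11 * p**4 * y
             + Fraction(3, 25) * p**6 + 4000 * p * q**2)
    return cubic**2 - 2**10 * disc * y

samples = [(1, 1), (-2, 3), (5, 1), (Fraction(1, 2), 7)]
values = [de_moivre_resolvent(p, q, 5 * Fraction(p) ** 2) for p, q in samples]
print(all(v == 0 for v in values))
```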


D. Solvable Quintics. We first point out the following immediate consequence
of Corollary 13.2.3 and Theorem 13.2.6.


Corollary 13.2.11 Assume that $f \in F[x]$ is monic and irreducible of degree 5 and
that $F$ has characteristic 0. Then $f$ is solvable by radicals over $F$ if and only if its
sextic resolvent $\theta_f(y)$ has a root in $F$. ■


As an application of this corollary, we will determine when an irreducible quintic
of the form

$f = x^5 + ax + b \in F[x]$

is solvable by radicals. We will assume that $F$ has characteristic 0. Here is the
somewhat surprising result.

Theorem 13.2.12 Assume that $f = x^5 + ax + b \in F[x]$ is irreducible, where $a \ne 0$
and $F$ has characteristic 0. Then $f$ is solvable by radicals over $F$ if and only if there
are $\lambda, \mu \in F$ such that

$a = \dfrac{3125\lambda\mu^4}{(\lambda - 1)^4(\lambda^2 - 6\lambda + 25)}, \qquad b = \dfrac{3125\lambda\mu^5}{(\lambda - 1)^4(\lambda^2 - 6\lambda + 25)}.$


Proof: In Exercise 15 you will show that $f$ has discriminant
$\Delta(f) = 256a^5 + 3125b^4$
and sextic resolvent
$\theta_f(y) = (y^3 - 20ay^2 + 240a^2y + 320a^3)^2 - 2^{10}(256a^5 + 3125b^4)\,y.$


By Corollary 13.2.11, $f$ is solvable by radicals if and only if $\theta_f(y)$ has a root in $F$.

Since $a \ne 0$, a root $\beta \in F$ of $\theta_f(y)$ can be written $\beta = 4a\lambda$ for $\lambda \in F$. We can also
write $b = a\mu$ for some $\mu \in F$. With these substitutions, it follows that $\theta_f(y)$ has a
root in $F$ if and only if there is $\lambda \in F$ such that

$0 = \theta_f(4a\lambda)$
$\ \ = \left((4a\lambda)^3 - 20a(4a\lambda)^2 + 240a^2(4a\lambda) + 320a^3\right)^2 - 2^{10}\left(256a^5 + 3125(a\mu)^4\right)(4a\lambda),$
which (after some algebra) simplifies to
$0 = 2^{12}a^5\left((\lambda^6 - 10\lambda^5 + 55\lambda^4 - 140\lambda^3 + 175\lambda^2 - 106\lambda + 25)\,a - 3125\lambda\mu^4\right).$
Since $a \ne 0$, this is equivalent to

$a = \dfrac{3125\lambda\mu^4}{\lambda^6 - 10\lambda^5 + 55\lambda^4 - 140\lambda^3 + 175\lambda^2 - 106\lambda + 25} = \dfrac{3125\lambda\mu^4}{(\lambda - 1)^4(\lambda^2 - 6\lambda + 25)},$

where the factorization of the denominator is easily done in Maple or Mathematica.
Using this and $b = a\mu$, we get the desired formulas for $a$ and $b$. ■
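
For integer $a \ne 0$ and $b$, Corollary 13.2.11 can be tested directly: the resolvent from the proof above is monic with integer coefficients, so any rational root is an integer dividing its constant term $(320a^3)^2$. The sketch below implements that search and also evaluates the parametrization of Theorem 13.2.12. The function names are hypothetical, and the divisor search is a naive illustration rather than an efficient algorithm. It is run on $x^5 - 6x + 3$ (Example 13.2.8) and on $x^5 + 15x + 12$ (Example 13.2.13).

```python
from fractions import Fraction

def resolvent_value(a, b, y):
    """theta_f(y) for f = x^5 + a x + b, using the formula from the proof above."""
    disc = 256 * a**5 + 3125 * b**4
    cubic = y**3 - 20 * a * y**2 + 240 * a**2 * y + 320 * a**3
    return cubic**2 - 2**10 * disc * y

def rational_roots(a, b):
    """Rational roots of theta_f for integers a != 0 and b (they are all integers)."""
    if a == 0:
        raise ValueError("this sketch assumes a != 0")
    bound = abs(320 * a**3)      # the constant term of theta_f is bound**2
    const = bound * bound
    roots = set()
    for d in range(1, bound + 1):
        if const % d == 0:
            # d and const // d together run over all positive divisors of const
            for cand in (d, -d, const // d, -(const // d)):
                if resolvent_value(a, b, cand) == 0:
                    roots.add(cand)
    return sorted(roots)

def parametrized(lam, mu):
    """(a, b) given by the formulas of Theorem 13.2.12."""
    lam, mu = Fraction(lam), Fraction(mu)
    denom = (lam - 1) ** 4 * (lam**2 - 6 * lam + 25)
    a = 3125 * lam * mu**4 / denom
    return a, a * mu

print(rational_roots(-6, 3))    # x^5 - 6x + 3
print(rational_roots(15, 12))   # x^5 + 15x + 12
print(parametrized(3, Fraction(4, 5)))
```

In the second case the root found equals $4a\lambda$ with $a = 15$ and $\lambda = 3$, which is consistent with the substitution used in the proof.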


We will say more about this result in the Historical Notes. 

Mathematical Notes

There are three topics we need to discuss further.


▪ Resolvents. The sextic resolvent appearing in Theorem 13.2.6 used $h = u^2$, where
$u$ is given in (13.17). This resolvent appears in [Chebotarev] and [36], for example.
The paper [1] contains an especially nice discussion of how $h$ relates to the star
diagram (13.18).

However, $h = u^2$ is not the only possibility. One can instead use

$\theta_1 = x_1^2x_2x_5 + x_1^2x_3x_4 + x_2^2x_1x_3 + x_2^2x_4x_5 + x_3^2x_1x_5$
$\quad + x_3^2x_2x_4 + x_4^2x_1x_2 + x_4^2x_3x_5 + x_5^2x_1x_4 + x_5^2x_2x_3.$

This leads to the sextic resolvent found in [5] and [10].

The table used in Examples 13.2.8-13.2.10 distinguishes between the groups
$\langle(12345)\rangle$ and $\mathrm{AGL}(1,\mathbb{F}_5) \cap A_5$ by factoring $f$ over $F(\alpha)$, where $\alpha$ is a root of $f$.
A natural question is whether this can be done with resolvents. The answer is yes,
with a small complication. For example, if $G \subset \mathrm{AGL}(1,\mathbb{F}_5) \cap A_5$, then the algorithm
given in [5] computes

$d = \left(\alpha_1\alpha_2(\alpha_2 - \alpha_1) + \alpha_2\alpha_3(\alpha_3 - \alpha_2) + \alpha_3\alpha_4(\alpha_4 - \alpha_3) + \alpha_4\alpha_5(\alpha_5 - \alpha_4) + \alpha_5\alpha_1(\alpha_1 - \alpha_5)\right)^2.$

One can prove that $d \in F$ and that if $d \ne 0$, then $G = \langle(12345)\rangle$ if and only if $d \in F^2$.
The latter condition is equivalent to the resolvent $y^2 - d$ having a root in $F$. Also
note that $d \ne 0$ guarantees that the resolvent is separable.

The problem is that $d = 0$ can occur. When this happens, one performs a Tschirnhaus
transformation to change $f$ into a polynomial for which $d \ne 0$. We will say
more about Tschirnhaus transformations in the Historical Notes. This complication
is why we used the factorization of $f$ over $F(\alpha)$ in part (c) of Theorem 13.2.6.


▪ Radical Solutions. When a quintic is solvable by radicals, it is natural to want
explicit formulas for the roots. These can be complicated.


Example 13.2.13 Using our methods, it is straightforward to see that the Galois
group over $\mathbb{Q}$ of $f = x^5 + 15x + 12 \in \mathbb{Q}[x]$ is isomorphic to $\mathrm{AGL}(1,\mathbb{F}_5)$. In [1] and
[33], it is shown that

$\left(\dfrac{-75 - 21\sqrt{10}}{125}\right)^{1/5} + \left(\dfrac{-75 + 21\sqrt{10}}{125}\right)^{1/5} + \left(\dfrac{225 - 72\sqrt{10}}{125}\right)^{1/5} + \left(\dfrac{225 + 72\sqrt{10}}{125}\right)^{1/5}$

is a root of $f$, where each fifth root is the real one. ◁▷


For an arbitrary solvable quintic, an algorithm for writing down the roots explicitly
is described in [10] and [21]. For the special case of solvable quintics of the form
$x^5 + ax + b$, the solutions are described in [1] and [33]. See also [23].


▪ Normal Forms. A quintic of the form $x^5 + ax + b$ is said to be in Bring–Jerrard
normal form. When $F$ has characteristic 0, it can be shown that an arbitrary quintic in
$F[x]$ can be transformed into Bring–Jerrard form using a Tschirnhaus transformation,
though in order to do so, one might need to replace $F$ with a solvable extension. A
procedure for doing this is described in [Dehn], [Postnikov], and [38]. In Exercise 16
you will show that in characteristic 5, not all quintic polynomials can be put into
Bring–Jerrard form.

Two Bring–Jerrard quintics $x^5 + ax + b$ and $x^5 + a'x + b'$ in $F[x]$ are equivalent if
there is $\lambda \in F^*$ such that

$x^5 + a'x + b' = \lambda^{-5}\left((\lambda x)^5 + a(\lambda x) + b\right).$

Hence the roots of $x^5 + a'x + b'$ are the roots of $x^5 + ax + b$ multiplied by $\lambda^{-1}$. Using
a form of Theorem 13.2.12, [34] shows that when $F = \mathbb{Q}$, there are infinitely many
inequivalent solvable Bring–Jerrard quintics over $\mathbb{Q}$.

However, if we switch to quintics of the form $x^5 + ax^2 + b$, there is a similar notion
of equivalence, but here, [34] shows that up to equivalence, there are only five such
solvable quintics. The argument reduces to finding $x, y \in \mathbb{Q}$ such that


2_ 13 89 2 8 I 
YOHX + To9% + 35% + 35: 


This is an elliptic curve (see the Mathematical Notes to Section 13.1) with only 
finitely many solutions over Q. 

One can show that an arbitrary quintic in characteristic 0 can be transformed into
a Brioschi quintic

$x^5 - 10Wx^3 + 45W^2x - W^2.$

The procedure for obtaining this normal form is described in [30, Ch. 5]. As for the
Bring–Jerrard form, a solvable extension of $F$ may be required to obtain the Brioschi
form (see [30, Fig. 5.9.1]). The surprise is that the Brioschi quintic is deeply related
to Section 7.5. This is because the rotational symmetries of the icosahedron give an
extension $K \subset \mathbb{C}(t)$ ($t$ a variable) with Galois group $A_5$. The books [20] and [30]
explain how a complete understanding of this extension enables one to find the roots
of any quintic polynomial.


Historical Notes 


The first serious attempt to find the roots of polynomials of degree $n \ge 5$ is due to
Tschirnhaus in 1683. His idea was to simplify

$x^n + a_{n-1}x^{n-1} + \cdots + a_0 = 0$
by a substitution of the form
(13.25) $y = b_0 + b_1x + \cdots + b_{n-1}x^{n-1},$

now called a Tschirnhaus transformation. Eliminating $x$ from these two equations
gives an equation in $y$ of degree $n$ that can be significantly simpler if $b_0,\dots,b_{n-1}$ are
chosen carefully.


Example 13.2.14 Given $x^3 + 3x + 1 \in \mathbb{Q}[x]$, consider the Tschirnhaus transformation
$y = a + bx + x^2$. In Exercise 17 you will show that eliminating $x$ leads to the equation

(13.26) $y^3 + (6 - 3a)y^2 + (9 + 3b + 3b^2 - 12a + 3a^2)y + P(a,b) = 0,$

where $P(a,b)$ is a polynomial in $a$ and $b$. You will also show that the coefficients of
$y^2$ and $y$ vanish if and only if $a$ and $b$ satisfy

$a = 2$ and $b^2 + b - 1 = 0.$

If we pick $b = (\sqrt{5} - 1)/2$, then the above cubic becomes

$y^3 = \dfrac{25 - 5\sqrt{5}}{2}.$

In Exercise 17 you will use this to solve the original cubic. Note that the resulting
Tschirnhaus transformation is defined over a degree 2 extension of $\mathbb{Q}$. ◁▷
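
One way to carry out the elimination in Example 13.2.14 for rational $a$ and $b$ is to note that $y = a + bx + x^2$ acts by multiplication on $\mathbb{Q}[x]/(x^3 + 3x + 1)$ and that the equation satisfied by $y$ is the characteristic polynomial of this linear map. The sketch below takes that route, which is an assumption of this illustration and not the symmetric-polynomial method of Section 2.3. It writes the matrix of $y$ in the basis $1, x, x^2$ and returns the three lower coefficients of the cubic; the function name is hypothetical.

```python
from fractions import Fraction

def tschirnhaus_cubic(a, b):
    """Coefficients (c2, c1, c0) of y^3 + c2 y^2 + c1 y + c0 = 0 for y = a + b x + x^2,
    where x^3 + 3x + 1 = 0."""
    a, b = Fraction(a), Fraction(b)
    # Columns are the images of 1, x, x^2 under multiplication by y,
    # reduced with x^3 = -3x - 1 and x^4 = -3x^2 - x.
    m = [[a, -1, -b],
         [b, a - 3, -3 * b - 1],
         [1, b, a - 3]]
    trace = m[0][0] + m[1][1] + m[2][2]
    minors = (m[0][0] * m[1][1] - m[0][1] * m[1][0]
              + m[0][0] * m[2][2] - m[0][2] * m[2][0]
              + m[1][1] * m[2][2] - m[1][2] * m[2][1])
    det = (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
           - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
           + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))
    return -trace, minors, -det

print(tschirnhaus_cubic(2, 1))   # a = 2 kills the y^2 term
print(tschirnhaus_cubic(0, 0))   # y = x^2
```

The first two entries agree with the coefficients $6 - 3a$ and $9 + 3b + 3b^2 - 12a + 3a^2$ in (13.26), and the third entry is the value of $P(a,b)$.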


Tschirnhaus transformations can be used to solve all cubic and quartic equations.
They fail in degree 5, though Bring in 1786 and Jerrard in 1834 showed that an arbitrary
quintic can be put into Bring–Jerrard form using Tschirnhaus transformations
defined over suitable solvable extensions of the original field.

The quintic polynomial $x^5 + px^3 + \tfrac{1}{5}p^2x + q$ from Example 13.2.10 was solved by
De Moivre in 1706. The polynomial reappears in a 1764 paper of Euler devoted to
algebraic equations. In this paper, Euler writes an equation of degree 5 as

$x^5 = Ax^3 + Bx^2 + Cx + D.$

(Can you explain why he omitted the $x^4$ term?) Let the roots be $\alpha, \beta, \gamma, \delta, \epsilon$ in some
splitting field. Euler was seeking a formula of the form

(13.27) $\alpha = \mathfrak{A}\sqrt[5]{v} + \mathfrak{B}\sqrt[5]{v^2} + \mathfrak{C}\sqrt[5]{v^3} + \mathfrak{D}\sqrt[5]{v^4},$

where $v$ is a root of an equation of degree $\le 4$, and similar formulas for $\beta, \gamma, \delta, \epsilon$ with
the radicals multiplied by suitable fifth roots of unity. Euler shows that this strategy
works for polynomials of degrees 2, 3, and 4, but for degree 5 he succeeds only in
some special cases.

More precisely, he shows that if certain of the coefficients $\mathfrak{A}, \mathfrak{B}, \mathfrak{C}, \mathfrak{D}$ are zero in
(13.27), then the original quintic reduces to one of the three special forms:

(13.28)
$x^5 = D,$
$x^5 = 5Px^2 + 5Qx + Q^2/P + P^3/Q,$
$x^5 = 5Px^3 - 5P^2x + D.$

We can analyze the resulting Galois groups as follows. Assume that the polynomials
of (13.28) have coefficients in $\mathbb{Q}$ and are irreducible. Then the Galois group of the
first polynomial is isomorphic to $\mathrm{AGL}(1,\mathbb{F}_5)$ by Section 6.4, and the same is true
for the third polynomial by Example 13.2.10. You will show in Exercise 18 that the
Galois group is also $\mathrm{AGL}(1,\mathbb{F}_5)$ for the second polynomial in (13.28).

However, if we adjoin the fifth roots of unity (standard procedure in the eighteenth
century), then you will show in Exercise 18 that the first two equations of (13.28)
have cyclic Galois group over $\mathbb{Q}(\zeta_5)$ while the third has Galois group isomorphic to
$\mathrm{AGL}(1,\mathbb{F}_5) \cap A_5$. Details of what Euler did can be found in [25].

The first sextic resolvents for the quintic are due to Lagrange, Malfatti, and
Vandermonde around 1770. Lagrange used

$z = 2\left(x_1^3(x_2x_5 + x_3x_4) + x_2^3(x_1x_3 + x_4x_5) + x_3^3(x_2x_4 + x_1x_5) + x_4^3(x_3x_5 + x_1x_2) + x_5^3(x_1x_4 + x_2x_3)\right)$
$\quad + 3\left(x_1(x_2^2x_5^2 + x_3^2x_4^2) + x_2(x_1^2x_3^2 + x_4^2x_5^2) + x_3(x_2^2x_4^2 + x_1^2x_5^2) + x_4(x_1^2x_2^2 + x_3^2x_5^2) + x_5(x_1^2x_4^2 + x_2^2x_3^2)\right).$

Lagrange was led to this using the ideas discussed at the end of Section 12.1. The
polynomial $z$ is invariant under $\mathrm{AGL}(1,\mathbb{F}_5)$ and is the root of a sextic resolvent

$z^6 - Az^5 + Bz^4 - Cz^3 + Dz^2 - Ez + F.$

Lagrange computes $A$ explicitly in terms of $\sigma_1,\sigma_2,\sigma_3,\sigma_4,\sigma_5$. The formulas are
similar to (13.24), except that Lagrange computed them by hand (no computers back
then!). Rather than continue with $B, C, D, E, F$, Lagrange comments that they can be
computed by similar methods and goes on to say:

But we shall not insert here such details which, besides that they would require
very long calculations, would moreover not cast any light on the resolution of
equations of the fifth degree.

(See [Lagrange, p. 342].) For Lagrange, getting a resolvent of degree 6 was not
helpful, since he wanted to reduce to equations of degree smaller than 5 (which we
now know to be impossible). The irony is that one can use Lagrange’s sextic to obtain
a criterion similar to Corollary 13.2.11 for deciding which quintics are solvable by
radicals. (This was done by Galois; see below.)

Malfatti’s sextic resolvent is closely related to Lagrange’s. He computed all of
its coefficients in terms of the elementary symmetric polynomials and knew that the
quintic was solvable by radicals when the sextic had a rational root. The converse
was proved by Luther in 1847. The resolvent $h = u^2$ used in the text is due to Jacobi
in 1835. He was the first to prove (13.23). In 1861, Cayley independently discovered
this resolvent and related it to the star diagram (13.18) (see also [1]).

Galois was naturally the first to think about this in terms of the Galois group. 
He showed that Lagrange’s sextic resolvent has a rational root if and only if the 
corresponding quintic is solvable by radicals. We will see in Section 14.1 that Galois 
also generalized this to irreducible polynomials of prime degree. 

Theorem 13.2.12 about when a Bring—Jerrard quintic is solvable by radicals was 
first proved by Runge in 1885. In the same year, Glashan and Young published 
different versions of the same result. A modern proof appears in [33]. See also [23]. 

More on the history of the quintic equation can be found in [27], [39], and [40]. 


Exercises for Section 13.2 


Exercise 1. As explained in the text, we can regard $\mathrm{AGL}(1,\mathbb{F}_5)$ as a subgroup of $S_5$.
(a) Prove that $\mathrm{AGL}(1,\mathbb{F}_5)$ is generated by (12345) and (1243).
(b) Prove that $\mathrm{AGL}(1,\mathbb{F}_5) \cap A_5$ is generated by (12345) and (14)(23).
(c) Prove that the group of part (b) is isomorphic to the dihedral group $D_{10}$ of order 10.
(d) Prove that $\langle(12345)\rangle$, $\mathrm{AGL}(1,\mathbb{F}_5) \cap A_5$, and $\mathrm{AGL}(1,\mathbb{F}_5)$ are the only subgroups of
$\mathrm{AGL}(1,\mathbb{F}_5)$ containing $\langle(12345)\rangle$.


Exercise 2. This exercise will consider some simple properties of $S_5$.
(a) Prove that $\langle(12345)\rangle$ is a 5-Sylow subgroup of $S_5$ and more generally is a 5-Sylow
subgroup of any subgroup $G \subset S_5$ containing $\langle(12345)\rangle$.
(b) Prove that $S_5$ has twenty-four 5-cycles.


Exercise 3. Let $G \subset S_5$ be transitive, and let $N$ be the number of subgroups of $G$ of order 5.
In this exercise, you will use an argument from [Postnikov] to prove that $N = 1$ or $6$ without
using the Sylow Theorems. Let $C = \{\tau \in S_5 \setminus G \mid \tau \text{ is a 5-cycle}\}$.
(a) Prove that $\sigma\cdot\tau = \sigma\tau\sigma^{-1}$ defines an action of $G$ on $C$.
(b) Let $\tau \in S_5$ be a 5-cycle. Prove that $\sigma \in S_5$ satisfies $\sigma\tau\sigma^{-1} = \tau$ if and only if $\sigma \in \langle\tau\rangle$.
(c) Use parts (a) and (b) to prove that $|G|$ divides $|C|$.
(d) Prove that $4N + |C| = 24$.
(e) Use parts (c) and (d) to prove that $N = 1$ or $6$.


Exercise 4. Prove that (13.19) gives coset representatives of $\mathrm{AGL}(1,\mathbb{F}_5)$ in $S_5$.

Exercise 5. Complete the proof of part (b) of Theorem 13.2.6. Then prove part (c).


Exercise 6. In this exercise, you will use Maple or Mathematica to prove (13.23) and (13.24).
(a) The first step is to enter (13.17) and call it, for example, u1. Then use substitution
commands and (13.19) to create u2,...,u6. For example, u2 is obtained by applying
(123) to u1. In Maple, this is done via the command

u2 := subs({x1 = x2, x2 = x3, x3 = x1}, u1);
whereas in Mathematica one uses
u2 := u1 /. {x1 -> x2, x2 -> x3, x3 -> x1}

(b) Now multiply out $\Theta(y) = (y - u_1)\cdots(y - u_6)$ and use the methods of Section 2.3 to
express the coefficients of $\Theta(y)$ in terms of the elementary symmetric polynomials.
(c) Show that your results imply (13.23) and (13.24).


Exercise 7. Consider $\mathrm{AGL}(1,\mathbb{F}_5) \cap A_5 \subset S_5$, and let $u$ be defined as in (13.17).
(a) Prove that the symmetry group of $u$ is $\mathrm{AGL}(1,\mathbb{F}_5) \cap A_5$.
(b) Prove that (13.19) gives coset representatives of $\mathrm{AGL}(1,\mathbb{F}_5) \cap A_5$ in $A_5$.


Exercise 8. Let $u_1,\dots,u_6$ be as in the proof of Proposition 13.2.5, and let $\tau \in S_5$ be a
transposition.
(a) For each $i$, prove that $\tau\cdot u_i = -u_j$ for some $j$.
(b) Let $\Theta(y) = \prod_{i=1}^{6}(y - u_i)$ and write this polynomial as
$\Theta(y) = y^6 + B_1y^5 + B_2y^4 + B_3y^3 + B_4y^2 + B_5y + B_6.$
Use part (a) to show that $\tau\cdot B_i = (-1)^iB_i$ for $i = 1,\dots,6$.
(c) Explain how part (b) and the results of Chapter 2 imply that the coefficients $B_2, B_4, B_6$ are
polynomials in $\sigma_1,\sigma_2,\sigma_3,\sigma_4,\sigma_5$. This explains why the formulas (13.24) exist.
(d) Use Exercise 3 of Section 7.4 to show that the coefficients $B_1, B_3, B_5$ must be of the form
$B\sqrt{\Delta}$, where $B$ is a polynomial in $\sigma_1,\sigma_2,\sigma_3,\sigma_4,\sigma_5$.
(e) Note that $\sqrt{\Delta}$ has degree 10 as a polynomial in $x_1,x_2,x_3,x_4,x_5$. By considering the
degrees of $B_1, B_3, B_5$ as polynomials in $x_1,x_2,x_3,x_4,x_5$, show that part (d) implies that
$B_1 = B_3 = 0$ and that $B_5$ is a constant multiple of $\sqrt{\Delta}$. This explains (13.23).


Exercise 9. This exercise will prove the first equivalence of Proposition 13.2.7.
(a) First suppose that $\theta_f(y)$ is irreducible. Prove that $|G|$ is divisible by 6, and explain why
this implies that $A_5 \subset G$.
(b) Now suppose that $A_5 \subset G$. Prove that $\mathrm{Gal}(L/F)$ acts transitively on $\beta_1,\dots,\beta_6$. However,
we don’t know that $\beta_1,\dots,\beta_6$ are distinct.
(c) Let $p(y)$ be the minimal polynomial of $\beta_1$ over $F$. By part (b), it is also the minimal
polynomial of $\beta_2,\dots,\beta_6$. Prove that $\theta_f(y) = p(y)^m$, where $m = 1$, 2, 3, or 6. The proof of
Theorem 13.2.6 shows that $m = 6$ cannot occur, and $m = 1$ implies that $\theta_f(y)$ is irreducible
over $F$. It remains to consider what happens when $m = 2$ or 3.
(d) Show that $(y^3 + ay^2 + by + c)^2 = \theta_f(y)$ implies that $\Delta(f) = 0$. Hence this case can’t
occur.
(e) Show that $(y^2 + ay + b)^3 = \theta_f(y)$ implies that $4b = a^2$, and then use this to show that
$\Delta(f) = 0$.


Exercise 10. This exercise will prove the second equivalence of Proposition 13.2.7. Note that
one direction follows trivially from Theorem 13.2.6. So we can assume that $G \subset \mathrm{AGL}(1,\mathbb{F}_5)$
and that $\theta_f(y) = (y - \beta_1)g(y)$ where $\beta_1 \in F$.
(a) Use $(12345) \in G$ to prove that $\mathrm{Gal}(L/F)$ acts transitively on $\beta_2,\dots,\beta_6$. As in the previous
exercise, we don’t know if $\beta_2,\dots,\beta_6$ are distinct.
(b) Let $p(y)$ be the minimal polynomial of $\beta_2$ over $F$. By part (a), it is also the minimal
polynomial of $\beta_3,\dots,\beta_6$. Prove that $\theta_f(y) = (y - \beta_1)p(y)^m$, where $m = 1$ or 5. If $m = 1$,
then we are done. So we need to rule out $m = 5$.
(c) Show that $(y - \beta_1)(y - \beta_2)^5 = \theta_f(y)$ implies that $\beta_1 = \beta_2$, and then use this to show that
$\Delta(f) = 0$.

Exercise 11. Show that the table preceding Example 13.2.8 follows from the diagram (13.16)
and Theorem 13.2.6.


Exercise 12. Let $f = x^5 - 6x + 3 \in \mathbb{Q}[x]$ be as in Example 13.2.8. Compute $\Delta(f)$ and $\theta_f(y)$
and show that $\theta_f(y)$ is irreducible over $\mathbb{Q}$.


Exercise 13. Let $f = x^5 - 2 \in \mathbb{Q}(\sqrt{5})[x]$ be as in Example 13.2.9.
(a) Compute $\Delta(f)$ and $\theta_f(y)$.
(b) In Section 6.4 we showed that the Galois group of $f$ over $\mathbb{Q}$ is isomorphic to $\mathrm{AGL}(1,\mathbb{F}_5)$.
Use this and the Galois correspondence to show that the Galois group over $\mathbb{Q}(\sqrt{5})$ is
isomorphic to $\mathrm{AGL}(1,\mathbb{F}_5) \cap A_5$.


Exercise 14. Let $f = x^5 + px^3 + \tfrac{1}{5}p^2x + q \in \mathbb{Q}[x]$ be as in Example 13.2.10, and assume that
$f$ is irreducible over $\mathbb{Q}$.
(a) Compute $\Delta(f)$ and $\theta_f(y)$.
(b) Factor $\theta_f(y) \in \mathbb{Q}[y]$, and conclude that $5p^2 \in \mathbb{Q}$ is a root of $\theta_f(y)$.
(c) Show that the substitution $x = z - \frac{p}{5z}$ transforms $f$ into $z^5 - \left(\frac{p}{5z}\right)^5 + q$.
(d) Use part (c) to give an elementary proof that $f$ is solvable by radicals over $\mathbb{Q}$.


Exercise 15. As in Theorem 13.2.12, let $f = x^5 + ax + b$. Compute $\Delta(f)$ and $\theta_f(y)$.


Exercise 16. Let $f = x^5 + ax + b \in F[x]$, where $f$ is separable and irreducible and $F$ has
characteristic 5. The goal of this exercise is to prove the observation of [28] that the Galois
group of $f$ over $F$ is solvable.
(a) Prove that $a \ne 0$.
(b) Use Exercise 5 from Section 6.2 to show that the Galois group of $f$ over $F$ is cyclic when
$a = -1$.
(c) Show that there is a Galois extension $F \subset L$ with solvable Galois group such that $f$ is
equivalent (as defined in the Mathematical Notes) to a polynomial of the form $x^5 - x + b'$
for some $b' \in L$.
(d) Conclude that the Galois group of $f$ over $F$ is solvable.
(e) Show that there is a field $F$ of characteristic 5 and a monic, separable, irreducible quintic
$g \in F[x]$ that cannot be transformed to one in Bring–Jerrard form defined over any Galois
extension $F \subset L$ with solvable Galois group.

In [28] Ruppert explores the geometric reasons why things go wrong in characteristic 5.


Exercise 17. Following Example 13.2.14, consider the equations $x^3 + 3x + 1 = 0$ and
$y = a + bx + x^2$.
(a) Use Maple or Mathematica and Section 2.3 to eliminate $x$ and obtain (13.26).
(b) Show that coefficients of $y^2$ and $y$ in (13.26) both vanish if and only if $a = 2$ and
$b^2 + b - 1 = 0$.
(c) The equation for $y$ becomes trivial to solve when $a = 2$ and $b = (\sqrt{5} - 1)/2$. We could
then solve for $x$ using $y = a + bx + x^2$, but there is a better way to proceed. Note that

$x^3 = -bx^2 - ax + yx$

follows from $y = a + bx + x^2$. Furthermore, we can use $y = a + bx + x^2$ to eliminate the $x^2$
in the above equation. Then use $x^3 + 3x + 1 = 0$ to obtain an equation in which $x$ appears
only to the first power. Solving this gives a formula for $x$ in terms of $y$. The general
version of this argument can be found in [Lagrange, p. 223].


Exercise 18. This exercise is concerned with the polynomials (13.28). As in the Historical
Notes, we will assume that they lie in $\mathbb{Q}[x]$ and are irreducible.
(a) Show that $\sqrt[5]{Q^2/P} + (P/Q)\left(\sqrt[5]{Q^2/P}\right)^2$ is a root of $x^5 - 5Px^2 - 5Qx - Q^2/P - P^3/Q$.
(b) Prove that the Galois group of $x^5 - 5Px^2 - 5Qx - Q^2/P - P^3/Q$ over $\mathbb{Q}$ is isomorphic to
$\mathrm{AGL}(1,\mathbb{F}_5)$.
(c) Prove that over $\mathbb{Q}(\zeta_5)$, the first two polynomials of (13.28) have cyclic Galois group
while the third has Galois group isomorphic to $\mathrm{AGL}(1,\mathbb{F}_5) \cap A_5$.


Exercise 19. Use the methods of this section to compute the Galois group over Q of each 
of the following polynomials. Be sure to check that they are irreducible. Remember that in 
Section 4.2 we learned how to factor polynomials over a finite extension of Q. 

(a) 4x41. 

(b) $x^5 + 20x + 16$.

(c) $x^5 + 2$.

(d) $x^5 - 5x + 12$.

(e) $x^5 + x^4 - 4x^3 - 3x^2 + 3x + 1$.


Exercise 20. In the Mathematical Notes to Section 10.3, we noted that the roots of the
polynomial $x^5 - 4x^4 + 2x^3 + 4x^2 + 2x - 6 \in \mathbb{Q}[x]$ can be constructed using a marked ruler and
compass. Show that this polynomial is not solvable by radicals over $\mathbb{Q}$.


13.3. RESOLVENTS


So far, we have explained how to compute Galois groups of polynomials of degree 4 
or 5. It is time to turn our attention to polynomials of higher degree. We will see that 
generalizations of the resolvents used in Sections 13.1 and 13.2 lead to a systematic 
strategy for computing Galois groups. 


A. Jordan’s Strategy. In Theorem 13.1.1 we used the Ferrari resolvent to help
determine the Galois group of an irreducible polynomial of degree 4, and the sextic
resolvent played a similar role in Theorem 13.2.6 for polynomials of degree 5. This
strategy for computing Galois groups was first described by Jordan in 1870:

The path to follow to treat this question [computing the Galois group] will be
the following: 1° one will form the various groups of the possible substitutions
$G, G', \dots$ among the roots of the equation; 2° let $G$ be one of these groups, chosen
at will: one will affirm for oneself whether or not it contains the group of the
equation by forming a function $\varphi$ of the roots, invariant under the substitutions
of $G$ and variable for other substitutions, calculating by the method of symmetric
functions the equation [the resolvent] that has for roots the various values of $\varphi$,
and looking for a rational root. Among the groups of the series $G, G', \dots$ that in
this way contain the group of the equation, the smallest will be the group itself.

(See [Jordan 1, p. 276].) The sextic resolvent $\theta_f(y)$ used in Theorem 13.2.6 follows
this model nicely: for $G = \mathrm{AGL}(1,\mathbb{F}_5)$, we have $\varphi = h = u^2$, whose symmetry group
is precisely $G$, and the universal sextic $\theta(y)$ is the polynomial “that has for roots the
various values of $\varphi$,” which when evaluated at the coefficients of an irreducible quintic
$f \in F[x]$ gives the resolvent $\theta_f(y)$. By Theorem 13.2.6, the question of whether the
Galois group of $f$ is conjugate to a subgroup of $\mathrm{AGL}(1,\mathbb{F}_5)$ is equivalent to “looking
for a rational root,” i.e., a root of $\theta_f(y)$ in $F$.
As the discussion of the sextic $\theta(y)$ reveals, Jordan’s description is not perfect.
In fact, it omits three important things:
• First, one needs to distinguish between the resolvent in the universal case and its
specialization to the given polynomial.
• Second, having a rational root only implies that $G$ contains the Galois group up to
conjugacy.
• Third, this can fail if the rational root is not simple.
We will say more about these items below. Nevertheless, Jordan’s description is
remarkably close to some of the modern methods used to compute the Galois group
of an irreducible separable polynomial $f \in F[x]$ of degree $n$. Let $G_f \subset S_n$ correspond
to the Galois group of $f$ over $F$. Thus

\[ \mathrm{Gal}(L/F) \simeq G_f \subset S_n, \]

where $F \subset L$ is a splitting field. In earlier sections, $G_f$ was called $G$, but in the
discussion that follows, $G$ will instead denote an arbitrary transitive subgroup of $S_n$.
Here is the step-by-step process for determining $G_f$.


Step 1: Classify Groups. Transitive subgroups $G \subset S_n$ have been classified up to
conjugacy for $n < 32$ (see [4] and [15]). Published tables [7] go up to $n = 15$.


Step 2: Find Polynomials. For each $G \subset S_n$ from Step 1, we need to find a polynomial
$\varphi$ in $x_1,\dots,x_n$ whose symmetry group is $H(\varphi) = G$. Stauduhar’s 1973 paper [36],
which pioneered the modern approach to Jordan’s strategy, lists a polynomial $\varphi$ for
each transitive subgroup of S4, Ss, and S7 (and Sg with some errors noted in [13]). 
We will follow the standard convention that $\varphi$ has coefficients in $\mathbb{Z}$.


Step 3: Compute Resolvents. Take $G$ and $\varphi$ from Step 2. Following the method of
Sections 13.1 and 13.2, we compute the resolvent $\Theta(y)$ in the universal case, write its
coefficients in terms of the elementary symmetric polynomials, and then specialize
to the coefficients of $f$. In terms of the roots $\alpha_1,\dots,\alpha_n$ of $f$, the resolvent is

\[ \Theta_f(y) = \bigl(y - \varphi_1(\alpha_1,\dots,\alpha_n)\bigr) \cdots \bigl(y - \varphi_m(\alpha_1,\dots,\alpha_n)\bigr), \]

where $\varphi = \varphi_1, \varphi_2, \dots, \varphi_m$ is the orbit of $\varphi$ under the action of $S_n$ (so $m$ is the index of $G$
in $S_n$). The problem is that the universal resolvent might be huge.

When $F = \mathbb{Q}$, we can avoid this difficulty as follows. Suppose that $f \in \mathbb{Z}[x]$ is monic and
irreducible of degree $n$. (Exercise 1 will explain why we can restrict to monic polynomials with
integer coefficients.) Then compute accurate numerical approximations $\alpha_1^*,\dots,\alpha_n^*$
of the roots of $f$, and multiply out the approximate resolvent

\[ \bigl(y - \varphi_1(\alpha_1^*,\dots,\alpha_n^*)\bigr) \cdots \bigl(y - \varphi_m(\alpha_1^*,\dots,\alpha_n^*)\bigr). \tag{13.29} \]

However, since the true resolvent has integer coefficients in this case (you will prove
this in Exercise 2), it follows that if we have computed the approximate roots $\alpha_i^*$
accurately enough, then the true resolvent is obtained from the approximate one
by rounding its coefficients to the nearest integer. Doing this rigorously requires a
careful understanding of the numerical issues involved. See [36] for the details.
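The rounding step can be illustrated with a short Python sketch. Everything named here is an assumption made for illustration: the helpers `poly_from_roots` and `round_to_integers`, the tolerance `tol`, and the test case $x^4 - 2$ together with the Ferrari resolvent of Section 13.1 (chosen because its roots are known in closed form). It is not the rigorous procedure of [36]; it only shows why one should refuse to round when a coefficient is not visibly close to an integer.

```python
def poly_from_roots(roots):
    """Multiply out (y - r_1)...(y - r_m); coefficients run from the highest power down."""
    coeffs = [1 + 0j]
    for r in roots:
        shifted = coeffs + [0j]                  # coefficients of y * p(y)
        scaled = [0j] + [r * c for c in coeffs]  # coefficients of r * p(y)
        coeffs = [s - t for s, t in zip(shifted, scaled)]
    return coeffs


def round_to_integers(coeffs, tol=1e-6):
    """Round every coefficient to the nearest integer, refusing if one is farther than tol."""
    rounded = []
    for c in coeffs:
        nearest = round(c.real)
        if abs(c - nearest) > tol:
            raise ValueError(f"coefficient {c} is not within {tol} of an integer")
        rounded.append(nearest)
    return rounded


# Floating-point roots of f = x^4 - 2 (illustrative choice, not from the text).
a = 2 ** 0.25
alpha = [a, a * 1j, -a, -a * 1j]

# Approximate roots of the Ferrari resolvent: a1*a2 + a3*a4, a1*a3 + a2*a4, a1*a4 + a2*a3.
beta = [
    alpha[0] * alpha[1] + alpha[2] * alpha[3],
    alpha[0] * alpha[2] + alpha[1] * alpha[3],
    alpha[0] * alpha[3] + alpha[1] * alpha[2],
]

# Rounded coefficients of the resolvent, highest power first.
ferrari = round_to_integers(poly_from_roots(beta))
```

Here the rounded coefficients describe $y^3 + 8y$, which is what one gets by multiplying out $y(y - 2\sqrt{2}\,i)(y + 2\sqrt{2}\,i)$ exactly. A coefficient with a visible imaginary part or a fractional part larger than `tol` raises an error instead of being silently rounded.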


Example 13.3.1 Consider $f = x^5 - 6x + 3 \in \mathbb{Q}[x]$ from Example 13.2.8. To 16
decimal places, the approximate roots of $f$ are

\begin{align*}
\alpha_1 &= -1.6709352644808655592, \\
\alpha_2 &= -0.1181039225949867235 - 1.5874591621207593640\,i, \\
\alpha_3 &= -0.1181039225949867235 + 1.5874591621207593640\,i, \\
\alpha_4 &= 0.5055012304055246668, \\
\alpha_5 &= 1.4016418792653143394.
\end{align*}

Evaluating the polynomials $h_i = u_i^2$ from Section 13.2 at these numbers gives

\begin{align*}
\beta_1 &= -43.4376362799772861 + 28.6930156587206645\,i, \\
\beta_2 &= -71.5507381341784308 - 94.8067689529853707\,i, \\
\beta_3 &= -71.5507381341784308 + 94.8067689529853707\,i, \\
\beta_4 &= -5.0116255858442831 + 9.9920056672183422\,i, \\
\beta_5 &= -5.0116255858442831 - 9.9920056672183422\,i, \\
\beta_6 &= -43.4376362799772861 - 28.6930156587206645\,i
\end{align*}
as approximate roots of the sextic resolvent $\theta_f(y)$. It follows that (13.29) becomes

\begin{align*}
\theta_f(y) \approx{} & y^6 + 240.0000001\,y^5 + 31680.00001\,y^4 + 1935360.001\,y^3 \\
& + 58060800.02\,y^2 + (584838144.3 - 0.07\,i)\,y \\
& + 4777574402 + 0.1109586026\,i.
\end{align*}

However, multiplying out the formula for $\theta_f(y)$ given in Example 13.2.8 shows that

\begin{align*}
\theta_f(y) ={} & y^6 + 240y^5 + 31680y^4 + 1935360y^3 \\
& + 58060800y^2 + 584838144y + 4777574400.
\end{align*}

Looking at the constant term, we see that our approximation is not good enough.
Hence we need to increase the accuracy of the roots of $f$. ◁▷


The above calculation was done in Maple; Mathematica gives a similar result. 
The moral is that you need to know what you are doing when working numerically. 
Other methods for computing resolvents are discussed in [16]. 


Step 4: Use Resolvents. Suppose that $f \in F[x]$ is irreducible and separable of degree
$n$ and that $G \subset S_n$ and $\varphi$ with $H(\varphi) = G$ give the resolvent $\Theta_f(y) \in F[y]$. We can use
this to determine the location of the Galois group $G_f \subset S_n$ as follows.


Proposition 13.3.2 Let $f \in F[x]$ be separable and irreducible of degree $n$.
(a) If $G_f$ is conjugate to a subgroup of $G$, then $\Theta_f(y)$ has a root in $F$.
(b) If $\Theta_f(y)$ has a simple root in $F$, then $G_f$ is conjugate to a subgroup of $G$.


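Proof: Recall from Section 5.3 that a root of $\Theta_f(y)$ is simple if the corresponding
linear factor appears exactly once in the factorization of $\Theta_f(y)$ over a splitting field.

If $G_f$ is conjugate to a subgroup of $G$, then $G_f \subset G$ follows by suitably relabeling
the roots $\alpha_1,\dots,\alpha_n$ of $f$. Then $G_f \subset G$ easily implies that $\varphi(\alpha_1,\dots,\alpha_n)$ is invariant
under $\mathrm{Gal}(L/F)$ and hence lies in $F$ (be sure you can supply the details).

Conversely, let $\beta \in F$ be a simple root of $\Theta_f(y)$. By relabeling the roots of $f$, we
may assume that $\beta = \varphi(\alpha_1,\dots,\alpha_n)$. If $G_f \not\subset G$, then there is $\tau \in G_f$ such that $\tau \notin G$.
Then $\tau\cdot\varphi \ne \varphi$, so that the resolvent may be written

\begin{align*}
\Theta_f(y) &= \bigl(y - \varphi(\alpha_1,\dots,\alpha_n)\bigr)\bigl(y - (\tau\cdot\varphi)(\alpha_1,\dots,\alpha_n)\bigr)\cdots \\
&= \bigl(y - \varphi(\alpha_1,\dots,\alpha_n)\bigr)\bigl(y - \varphi(\alpha_{\tau(1)},\dots,\alpha_{\tau(n)})\bigr)\cdots.
\end{align*}

You will prove in Exercise 3 that $\beta = \varphi(\alpha_1,\dots,\alpha_n) \in F$ and $\tau \in G_f$ imply that

\[ \varphi(\alpha_1,\dots,\alpha_n) = \varphi(\alpha_{\tau(1)},\dots,\alpha_{\tau(n)}), \]

which is impossible because $\beta = \varphi(\alpha_1,\dots,\alpha_n)$ is a simple root of $\Theta_f(y)$. This
contradiction proves that $G_f \subset G$. ∎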
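When $F = \mathbb{Q}$ and the resolvent is monic with integer coefficients, its rational roots are integers dividing the last nonzero coefficient, so the hypothesis of Proposition 13.3.2 can be tested mechanically. The Python sketch below is only an illustration: the function names are hypothetical, and simplicity of a root is tested by checking that the formal derivative does not vanish there. It is applied to the sextic resolvent of $x^5 - 6x + 3$ from Example 13.3.1 and to the toy polynomial $y^4 - 4y^2 = y^2(y^2 - 4)$, which has a double root at 0.

```python
from math import isqrt


def evaluate(coeffs, y):
    """Evaluate a polynomial (coefficients from the highest power down) by Horner's rule."""
    value = 0
    for c in coeffs:
        value = value * y + c
    return value


def derivative(coeffs):
    """Coefficients of the formal derivative, highest power first."""
    n = len(coeffs) - 1
    return [c * (n - i) for i, c in enumerate(coeffs[:-1])]


def divisors(n):
    """All positive divisors of a nonzero integer n."""
    n = abs(n)
    small = [d for d in range(1, isqrt(n) + 1) if n % d == 0]
    return sorted(set(small + [n // d for d in small]))


def integer_roots(coeffs):
    """Integer roots of a monic integer polynomial, each paired with a flag
    that is True exactly when the root is simple."""
    last_nonzero = [c for c in coeffs if c != 0][-1]
    candidates = {0} if coeffs[-1] == 0 else set()
    for d in divisors(last_nonzero):
        candidates.update((d, -d))
    deriv = derivative(coeffs)
    return [(r, evaluate(deriv, r) != 0)
            for r in sorted(candidates) if evaluate(coeffs, r) == 0]


# Sextic resolvent of x^5 - 6x + 3 (coefficients from Example 13.3.1).
sextic = [1, 240, 31680, 1935360, 58060800, 584838144, 4777574400]
sextic_roots = integer_roots(sextic)

# Toy polynomial y^4 - 4y^2 with a non-simple root at 0.
toy_roots = integer_roots([1, 0, -4, 0, 0])
```

For the sextic the list of roots is empty, so the Galois group is not conjugate to a subgroup of $\mathrm{AGL}(1, \mathbb{F}_5)$. For the toy polynomial the root 0 is reported as not simple, which is exactly the situation in which part (b) of the proposition gives no information.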


For irreducible quartics, the Ferrari resolvent used in Theorem 13.1.1 is separable
when $f$ is separable, and for irreducible quintics, the same is true for the sextic
resolvent used in Theorem 13.2.6. Hence all roots of these resolvents are simple.
But in general, resolvents can have multiple roots. Here is an example.


Example 13.3.3 Let $n = 4$ and $\varphi = \sqrt{\Delta}\,(x_1 + x_2 - x_3 - x_4)$. In Exercise 4 you will
verify that the symmetry group of $\varphi$ is $G = \langle(1324)\rangle \subset S_4$ when $F$ has characteristic
$\ne 2$. Thus the corresponding resolvent $\Theta_f(y)$ has degree 6. For the polynomial
$f = x^4 + bx^2 + d$, $d \notin F^2$, from Example 13.1.4, you will show in Exercise 4 that

\[ \Theta_f(y) = y^2\bigl((y^2 + 4b\,\Delta(f))^2 - 2^6 d\,\Delta(f)^2\bigr). \]

This has the rational root $0 \in F$, yet we showed in Example 13.1.4 that the Galois
group is not contained in $\langle(1324)\rangle$ when $d(b^2 - 4d) \notin F^2$. So $\Theta_f(y)$ fails to give
accurate information about the Galois group, because 0 is not a simple root. ◁▷


This example shows the importance of the word “simple” in Proposition 13.3.2. 


Step 5: Repair Resolvents. Resolvents computed by the above process can fail 
if their rational roots are not simple. To fix this, the standard method is to use a 
Tschirnhaus transformation (see the Historical Notes to Section 13.2) to change f to 
a different polynomial $g$. In [14], it is shown that this can always be done in such a
way that $f$ and $g$ have the same Galois group and the corresponding resolvent $\Theta_g(y)$
is separable. See also [1, Algorithm 6.3.4]. Then redo Step 4 with $g$ and $\Theta_g(y)$.


Aside from some clever tricks, this method for computing Galois groups is the 
basis of the algorithm used in [5] for polynomials of degrees 4, 5, 6, and 7. However, 
we will see below that the galois command in Maple computes Galois groups using 
a slightly different approach. 


B. Relative Resolvents. The idea of a relative resolvent was introduced in Section 12.1
in the universal case. Relative resolvents also are implicit in Theorem 13.1.1,
as we will now explain.


Example 13.3.4 According to Example 13.3.3, $\langle(1324)\rangle \subset S_4$ is the symmetry group
of $\varphi = \sqrt{\Delta}\,(x_1 + x_2 - x_3 - x_4)$ in characteristic $\ne 2$. In Exercise 4 you showed that
in the universal case, $\varphi$ leads to the universal resolvent

\[ \Theta(y) = \prod_{i=1}^{3} \bigl(y^2 - \Delta\,(4y_i + \sigma_1^2 - 4\sigma_2)\bigr), \]

where $y_1 = x_1x_2 + x_3x_4$, $y_2 = x_1x_3 + x_2x_4$, $y_3 = x_1x_4 + x_2x_3$ are the roots of the universal
Ferrari resolvent $\theta(y)$. If $f = x^4 - c_1x^3 + c_2x^2 - c_3x + c_4 \in F[x]$ is irreducible and
separable and has roots $\alpha_1, \alpha_2, \alpha_3, \alpha_4$, then as usual $x_i \mapsto \alpha_i$ gives the resolvent

\[ \Theta_f(y) = \prod_{i=1}^{3} \bigl(y^2 - \Delta(f)\,(4\beta_i + c_1^2 - 4c_2)\bigr), \tag{13.30} \]

where $\beta_1, \beta_2, \beta_3$ are the roots of the Ferrari resolvent $\theta_f(y)$.
Suppose that we have already computed $\theta_f(y)$ and found that it has a root in $F$. As
usual, we may assume that $\beta_1 = \alpha_1\alpha_2 + \alpha_3\alpha_4 \in F$, so that $G_f \subset \langle(1324),(12)\rangle$. To
decide if $G_f$ lies in the subgroup $\langle(1324)\rangle$, we could use the above resolvent $\Theta_f(y)$.
But since we already know $\beta_1$, we could instead use the factor

\[ y^2 - \Delta(f)\,(4\beta_1 + c_1^2 - 4c_2) \in F[y], \tag{13.31} \]

which is an example of a relative resolvent. If (13.31) has a simple root in $F$, then
$G_f = \langle(1324)\rangle$ follows by the relative version of Proposition 13.3.2.

Note that (13.31) has a simple root in $F$ when $4\beta_1 + c_1^2 - 4c_2 \ne 0$ and
$\Delta(f)\,(4\beta_1 + c_1^2 - 4c_2) \in F^2$, just as described in part (c) of Theorem 13.1.1. Do you see how part (c)
handles the case when the relative resolvent (13.31) has a multiple root? ◁▷


The general theory of relative resolvents is described in [16] and [36]. Their main 
advantage is that they have smaller degree and hence are easier to compute. You will 
prove a version of Proposition 13.3.2 for relative resolvents in Exercise 5. 


C. Quartics in All Characteristics. The treatment of quartics in Section 13.1
assumed characteristic $\ne 2$. Here, inspired by Keith Conrad [6], we use the ideas
of this section to compute the Galois group of a monic irreducible separable quartic
polynomial $f = x^4 - c_1x^3 + c_2x^2 - c_3x + c_4 \in F[x]$ for any field $F$. As in Section 13.1,
$f$ has roots $\alpha_1, \alpha_2, \alpha_3, \alpha_4$, discriminant $\Delta(f)$, and Ferrari resolvent $\theta_f(y)$.

The main problem concerns the discriminant. In characteristic $\ne 2$, Theorem 7.4.1
tells us that $G_f \subset A_4$ if and only if $\sqrt{\Delta(f)} \in F$. This fails in characteristic 2 since

\begin{align*}
\sqrt{\Delta(f)} &= (\alpha_1 - \alpha_2)(\alpha_1 - \alpha_3)(\alpha_1 - \alpha_4)(\alpha_2 - \alpha_3)(\alpha_2 - \alpha_4)(\alpha_3 - \alpha_4) \\
&= (\alpha_1 + \alpha_2)(\alpha_1 + \alpha_3)(\alpha_1 + \alpha_4)(\alpha_2 + \alpha_3)(\alpha_2 + \alpha_4)(\alpha_3 + \alpha_4)
\end{align*}

is clearly invariant under the Galois action and hence lies in $F$.
In terms of symmetric polynomials, the problem is that the symmetry group of
$\sqrt{\Delta} = \prod_{1 \le i < j \le 4}(x_i - x_j)$ depends on the characteristic. We will replace $\sqrt{\Delta}$ with

\begin{align*}
D = \sum_{\sigma \in A_4} \sigma\cdot(x_1^3x_2^2x_3) ={} & x_1^3x_2^2x_3 + x_1^3x_3^2x_4 + x_1^3x_2x_4^2 + x_1^2x_2^3x_4 + x_1x_2^3x_3^2 + x_2^3x_3x_4^2 \\
& + x_1^2x_2x_3^3 + x_2^2x_3^3x_4 + x_1x_3^3x_4^2 + x_1^2x_3x_4^3 + x_1x_2^2x_4^3 + x_2x_3^2x_4^3.
\end{align*}

In Exercise 6 you will show that in characteristic $\ne 2$,

\[ D = \tfrac{1}{2}\bigl(\sigma_1\sigma_2\sigma_3 - 3\sigma_1^2\sigma_4 - 3\sigma_3^2 + 4\sigma_2\sigma_4\bigr) + \tfrac{1}{2}\sqrt{\Delta} \]

and that in all characteristics, $\sqrt{\Delta} = D - D'$, where $D' = (12)\cdot D$. Thus $D$ is “half”
of $\sqrt{\Delta}$ (the positive terms in characteristic $\ne 2$). By part (b) of Exercise 9 of
Section 12.1, the symmetry group of $D$ is $A_4$ in all characteristics.

Now let $D(f) = D(\alpha_1,\alpha_2,\alpha_3,\alpha_4)$ and $D'(f) = D'(\alpha_1,\alpha_2,\alpha_3,\alpha_4)$. The $S_4$-orbit
of $D$ consists of $D$ and $D'$, so that we get the quadratic resolvent polynomial

\[ D_f(y) = (y - D(f))(y - D'(f)) = y^2 - Ay + B, \tag{13.32} \]

where

\begin{align*}
A ={} & c_1c_2c_3 - 3c_1^2c_4 - 3c_3^2 + 4c_2c_4, \\
B ={} & c_2^3c_3^2 + c_1^3c_3^3 - 6c_1c_2c_3^3 + 9c_3^4 + c_1^2c_2^3c_4 - 4c_2^4c_4 - 6c_1^3c_2c_3c_4 + 22c_1c_2^2c_3c_4 \\
& + 6c_1^2c_3^2c_4 - 42c_2c_3^2c_4 + 9c_1^4c_4^2 - 42c_1^2c_2c_4^2 + 36c_2^2c_4^2 + 48c_1c_3c_4^2 - 64c_4^3.
\end{align*}

The formula for $A$ follows from Example 2.2.4, and the formula for $B$ can be computed
using the methods of Section 2.3.
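The relations $\sqrt{\Delta} = D - D'$, $A = D + D'$ and $B = D\,D'$ can be spot-checked numerically by evaluating both sides at integer points. The Python sketch below is such a check, not a proof: the function names are hypothetical, $D$ and $D'$ are computed directly as sums over even and odd permutations, and $c_1,\dots,c_4$ are the elementary symmetric polynomials of the chosen point.

```python
from itertools import combinations, permutations
from math import prod


def is_even(p):
    """True when the permutation p (one-line notation) has an even number of inversions."""
    inversions = sum(1 for i, j in combinations(range(len(p)), 2) if p[i] > p[j])
    return inversions % 2 == 0


def d_pair(x):
    """Return (D, D') at x: sums of x_s(1)^3 x_s(2)^2 x_s(3) over even and odd s in S_4."""
    d_even = d_odd = 0
    for p in permutations(range(4)):
        term = x[p[0]] ** 3 * x[p[1]] ** 2 * x[p[2]]
        if is_even(p):
            d_even += term
        else:
            d_odd += term
    return d_even, d_odd


def elementary(x):
    """Elementary symmetric polynomials c_1, ..., c_4 of the four numbers in x."""
    return [sum(prod(s) for s in combinations(x, k)) for k in range(1, 5)]


def check_identities(x):
    """Check sqrt(Delta) = D - D', A = D + D' and B = D * D' at the point x."""
    d, d_prime = d_pair(x)
    c1, c2, c3, c4 = elementary(x)
    sqrt_disc = prod(x[i] - x[j] for i, j in combinations(range(4), 2))
    a = c1 * c2 * c3 - 3 * c1**2 * c4 - 3 * c3**2 + 4 * c2 * c4
    b = (c2**3 * c3**2 + c1**3 * c3**3 - 6 * c1 * c2 * c3**3 + 9 * c3**4
         + c1**2 * c2**3 * c4 - 4 * c2**4 * c4 - 6 * c1**3 * c2 * c3 * c4
         + 22 * c1 * c2**2 * c3 * c4 + 6 * c1**2 * c3**2 * c4 - 42 * c2 * c3**2 * c4
         + 9 * c1**4 * c4**2 - 42 * c1**2 * c2 * c4**2 + 36 * c2**2 * c4**2
         + 48 * c1 * c3 * c4**2 - 64 * c4**3)
    return d - d_prime == sqrt_disc and d + d_prime == a and d * d_prime == b


all_ok = all(check_identities(x) for x in [(2, 3, 5, 7), (1, -1, 4, 0), (0, 1, 2, 3)])
```

Since $A^2 - 4B = (D - D')^2 = \Delta$, the same check also confirms that $D_f(y)$ has discriminant $\Delta(f)$, which is the fact used in the proof of Theorem 13.3.5.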

Using $D_f(y)$ and $\theta_f(y)$, we have the following preliminary result about the Galois
group of $f$. Recall that $\mathrm{Gal}(L/F) \simeq G_f \subset S_4$, where $L$ is a splitting field of $f$ over $F$.


Theorem 13.3.5 Let $f \in F[x]$ be monic, separable and irreducible of degree 4. Then
the subgroup $G_f \subset S_4$ is determined as follows:
(a) If $\theta_f(y)$ is irreducible over $F$, then

\[ G_f = \begin{cases} S_4, & \text{if } D_f(y) \text{ is irreducible over } F, \\ A_4, & \text{if } D_f(y) \text{ splits completely over } F. \end{cases} \]

(b) If $\theta_f(y)$ splits completely over $F$, then $G_f \simeq \mathbb{Z}/2\mathbb{Z} \times \mathbb{Z}/2\mathbb{Z}$. Furthermore, $\theta_f(y)$
splits completely over $F$ if and only if $\theta_f(y)$ and $D_f(y)$ are reducible over $F$.

(c) If $\theta_f(y)$ has a unique root in $F$, then $G_f$ is isomorphic to either $\mathbb{Z}/4\mathbb{Z}$ or $D_8$ (the
dihedral group of order 8). Furthermore, $\theta_f(y)$ has a unique root in $F$ if and
only if $\theta_f(y)$ is reducible over $F$ and $D_f(y)$ is irreducible over $F$.


Proof: We first study the case when $G_f \subset A_4$. The resolvent $D_f(y)$ from (13.32) has
discriminant $(D(f) - D'(f))^2$, which is $\Delta(f)$ since (as noted above) $\sqrt{\Delta} = D - D'$.
It follows that $D_f(y)$ is separable since $f$ is. Hence the roots of $D_f(y)$ are simple,
and then Proposition 13.3.2 implies that

\[ G_f \subset A_4 \iff D_f(y) \text{ has a root in } F. \tag{13.33} \]

The proof of part (a) is now identical to what we did in Theorem 13.1.1, except
that we use (13.33) rather than Theorem 7.4.1 to determine whether $G_f = A_4$ or $S_4$.

For the rest of the proof, suppose that $\theta_f(y)$ is reducible over $F$. Recall from
the proof of Theorem 13.1.1 that $f$ and $\theta_f(y)$ have the same discriminant. Since
$f$ is separable, it follows that the same is true for $\theta_f(y)$. We can assume that
$\beta = \alpha_1\alpha_2 + \alpha_3\alpha_4$ is a root of $\theta_f(y)$ in $F$. As in the earlier proof, this implies that $G_f$
is one of the three groups

\[ \langle(12)(34), (13)(24)\rangle \simeq \mathbb{Z}/2\mathbb{Z} \times \mathbb{Z}/2\mathbb{Z}, \quad \langle(1324)\rangle \simeq \mathbb{Z}/4\mathbb{Z}, \quad \langle(1324),(12)\rangle \simeq D_8. \]

If $D_f(y)$ is reducible over $F$, then $G_f \subset A_4$ by (13.33), which implies that $G_f =
\langle(12)(34), (13)(24)\rangle$ by the earlier proof. Since these permutations fix the other roots
$\alpha_1\alpha_3 + \alpha_2\alpha_4$, $\alpha_1\alpha_4 + \alpha_2\alpha_3$ of $\theta_f(y)$, it follows that $\theta_f(y)$ splits completely over $F$.

On the other hand, if $D_f(y)$ is irreducible over $F$, then $G_f \not\subset A_4$ by (13.33), which
forces $G_f$ to be $\langle(1324)\rangle$ or $\langle(1324),(12)\rangle$. Both of these groups contain $(1324)$,
which takes $\alpha_1\alpha_3 + \alpha_2\alpha_4$ to $\alpha_1\alpha_4 + \alpha_2\alpha_3$. Hence $\beta = \alpha_1\alpha_2 + \alpha_3\alpha_4$ is the only root
of $\theta_f(y)$ contained in $F$ since $\theta_f(y)$ is separable.
From here, parts (b) and (c) of the theorem follow easily. ∎
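Theorem 13.3.5 amounts to a four-way case split on the reducibility of $\theta_f(y)$ and $D_f(y)$ over $F$. The Python sketch below records that split as a lookup; the function name and its two boolean arguments are hypothetical, and deciding reducibility over a given field $F$ is assumed to have been done elsewhere.

```python
def classify_quartic(theta_reducible: bool, d_reducible: bool) -> str:
    """Galois group of a monic, separable, irreducible quartic from Theorem 13.3.5.

    theta_reducible: whether the Ferrari resolvent theta_f(y) is reducible over F.
    d_reducible:     whether the quadratic resolvent D_f(y) is reducible over F.
    """
    if not theta_reducible:
        # Part (a): theta_f irreducible, so the group is A_4 or S_4.
        return "A4" if d_reducible else "S4"
    if d_reducible:
        # Part (b): both resolvents reducible, so theta_f splits completely.
        return "Z/2Z x Z/2Z"
    # Part (c): theta_f has a unique root in F; Proposition 13.3.6 separates the two cases.
    return "Z/4Z or D8"
```

The last case is deliberately left undecided, since the theorem alone does not distinguish $\mathbb{Z}/4\mathbb{Z}$ from $D_8$.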


To complete Theorem 13.3.5, we need to distinguish between $\mathbb{Z}/4\mathbb{Z}$ and $D_8$ in
part (c) of the theorem. The criterion given in part (c) of Theorem 13.1.1 fails in
characteristic 2. The reason is twofold: First, $\Delta(f)$ is always a square in $F$ in this
case, and second, characteristic 2 implies that

\[ 4\beta + c_1^2 - 4c_2 = c_1^2 \]

is also a square in $F$. However, by replacing $\Delta(f)$ with $D_f(y)$, we get the following
version of Proposition 13.1.5 that works in all characteristics.


Proposition 13.3.6 As in Theorem 13.3.5, let $f = x^4 - c_1x^3 + c_2x^2 - c_3x + c_4 \in F[x]$.
Also assume that $\theta_f(y)$ has a unique root $\beta \in F$. Then $G_f$ is isomorphic to either
$\mathbb{Z}/4\mathbb{Z}$ or $D_8$, and the former occurs if and only if $(y^2 - c_1y + c_2 - \beta)(y^2 - \beta y + c_4)$
splits completely over $F(D(f))$.


Proof: Let $L = F(\alpha_1,\alpha_2,\alpha_3,\alpha_4)$ be the splitting field of $f$ over $F$. If $G_f \simeq \mathbb{Z}/4\mathbb{Z}$,
then $L$ contains a unique quadratic extension of $F$, which must be $F(D(f))$ since
$D_f(y) = (y - D(f))(y - D'(f))$ is irreducible over $F$ by part (c) of Theorem 13.3.5.
However, $y^2 - c_1y + c_2 - \beta$ and $y^2 - \beta y + c_4$ split completely over $L$ since

\begin{align*}
y^2 - c_1y + c_2 - \beta &= \bigl(y - (\alpha_1 + \alpha_2)\bigr)\bigl(y - (\alpha_3 + \alpha_4)\bigr), \\
y^2 - \beta y + c_4 &= (y - \alpha_1\alpha_2)(y - \alpha_3\alpha_4)
\end{align*}
\tag{13.34}

by Exercise 12 of Section 13.1. Hence they split completely over the unique quadratic
extension of $F$ contained in $L$. It follows that $(y^2 - c_1y + c_2 - \beta)(y^2 - \beta y + c_4)$ splits
completely over $F(D(f))$.

On the other hand, suppose that $G_f \simeq D_8$. By the proof of Theorem 13.1.1, we
may assume $\beta = \alpha_1\alpha_2 + \alpha_3\alpha_4$ and $G_f = \langle(1324),(12)\rangle$. Let $\sigma \in \mathrm{Gal}(L/F)$ map to
$(13)(24) \in G_f$. Then $\sigma$ fixes $D(f)$ since $(13)(24)$ is even. Hence

\[ \sigma \text{ is the identity on } F(D(f)). \tag{13.35} \]

Now suppose that $(y^2 - c_1y + c_2 - \beta)(y^2 - \beta y + c_4)$ splits completely over $F(D(f))$.
Then (13.34) implies that $\alpha_1 + \alpha_2,\ \alpha_1\alpha_2 \in F(D(f))$. Since $\sigma(\alpha_1 + \alpha_2) = \alpha_3 + \alpha_4$
and $\sigma(\alpha_1\alpha_2) = \alpha_3\alpha_4$, we conclude from (13.35) that

\[ \alpha_1 + \alpha_2 = \alpha_3 + \alpha_4 \quad\text{and}\quad \alpha_1\alpha_2 = \alpha_3\alpha_4. \]

Part (a) of Exercise 6 of Section 13.1 implies that $f$ is not separable. This contradiction
implies that $(y^2 - c_1y + c_2 - \beta)(y^2 - \beta y + c_4)$ does not split completely over $F(D(f))$
when $G_f \simeq D_8$. ∎


Proposition 13.3.6 appears in [6]. Here is an example from the same paper.

Example 13.3.7 Let $F = k(u)$, where $u$ is a variable and $k$ has characteristic 2. Let

\[ f = x^4 + (u+1)x^2 + ux + 1 \in F[x]. \]

In Exercise 7, you will verify the following:

• $f$ is irreducible and separable over $F$.

• $D_f(y) = y^2 + u^2y + u^5 + u^3 + u^2$.

• $\theta_f(y) = y^3 + (u+1)y^2 + u^2 = (y+u)(y^2 + y + u)$.

Hence $\beta = u$, so that the quadratic polynomials of Proposition 13.3.6 are given
by $y^2 - 0\cdot y + (u+1) - u = y^2 + 1 = (y+1)^2$ and $y^2 - uy + 1 = y^2 + uy + 1$. In
Exercise 7 you will show that $y^2 + uy + 1$ has no roots in the splitting field of $D_f(y)$.
Then $G_f \simeq D_8$ by Proposition 13.3.6. ◁▷


One can also distinguish between $\mathbb{Z}/4\mathbb{Z}$ and $D_8$ using relative resolvents. In
characteristic $\ne 2$, $\sqrt{\Delta}\,(x_1 + x_2 - x_3 - x_4)$ has symmetry group $\langle(1324)\rangle$ and gives
the relative resolvent $y^2 - \Delta(f)(4\beta + c_1^2 - 4c_2)$ described in Example 13.3.4. But


(13.36) p= xpxdxg + xbxdx + gating toxdxtey 


has symmetry group $\langle(1324)\rangle$ in all characteristics. In the situation of part (c) of
Theorem 13.3.5, we have $G_f \subset \langle(1324),(12)\rangle \simeq D_8$, and to decide whether or not
$G_f = \langle(1324)\rangle \simeq \mathbb{Z}/4\mathbb{Z}$, we can use the relative resolvent coming from (13.36),
namely

\[ (y - \varphi)\bigl(y - (12)\cdot\varphi\bigr) = y^2 - Ay + B, \]

where $\beta \in F$ is a root of $\theta_f(y)$ and


A = B(c1c3 — 2c4) — or - cic4 + 2c2c4, 
B= B( 2 2 —4 2 2 4 _ 3 3 2 2 
= B* (c2c3 + cjc2c4 — 4c5c4 + €4) + B(4c102¢€3¢4 — C103 — C1 0304 — 2¢2€4) 


2,2 2 4,2 2 2.24 4 
+ 2cic3cq4 — 8e2c3c4 + Cfe4 — Bcjc2c3 + 17 ¢3c4 + c3. 


By Proposition 13.3.2, we get $G_f = \langle(1324)\rangle$ when $y^2 - Ay + B$ has a simple root
in $F$. This is the approach taken in [36]. The problem is that $y^2 - Ay + B$ may fail
to have simple roots, which as mentioned earlier in the section requires the use of
Tschirnhaus transformations. Hence Proposition 13.3.6 is better for our purposes.


D. Factoring Resolvents. So far, we have asked whether resolvents have a 
rational root. But there are situations where the irreducible factorization of a resolvent 
can be useful, even if none of the factors have degree 1. Here is an example. 


Example 13.3.8 Let $f$ be an irreducible separable quartic, and consider the sextic
resolvent $\Theta_f(y)$ given in (13.30). Assume that $\Theta_f(y)$ is separable. In Exercise 8
you will prove that $G_f$ is conjugate to $\langle(1324),(12)\rangle$ if and only if $\Theta_f(y) = g(y)h(y)$,
where $g(y), h(y) \in F[y]$ are irreducible of degrees 2 and 4, respectively. ◁▷
A much more interesting example involves the group $\mathrm{GL}(3, \mathbb{F}_2)$ of invertible $3 \times 3$
matrices with entries in $\mathbb{F}_2$. In Section 14.3 we will see that $\mathrm{GL}(3, \mathbb{F}_2)$ is a simple
group of order 168. The smallest non-Abelian simple group is $A_5$ of order 60, and
one can prove that $\mathrm{GL}(3, \mathbb{F}_2)$ of order 168 is the next smallest.

Following [11] and [32] we will show that

\[ f = x^7 - 154x + 99 \in \mathbb{Q}[x] \]

is an irreducible polynomial whose Galois group over $\mathbb{Q}$ is isomorphic to $\mathrm{GL}(3, \mathbb{F}_2)$.
Our tool will be the factorization of a resolvent of degree 35.

First observe that $\mathrm{GL}(3, \mathbb{F}_2)$ acts on the eight-element vector space $\mathbb{F}_2^3$ by matrix
multiplication. The origin $0 = (0,0,0)$ is fixed, but the seven nonzero vectors get
permuted. In Exercise 9 you will show that labeling these vectors $v_1,\dots,v_7$ induces
a one-to-one group homomorphism

\[ \mathrm{GL}(3, \mathbb{F}_2) \to S_7. \tag{13.37} \]

For simplicity we will identify $\mathrm{GL}(3, \mathbb{F}_2)$ with its image under this map and hence
regard $\mathrm{GL}(3, \mathbb{F}_2)$ as a subgroup of $S_7$.

Now consider $\varphi = x_1 + x_2 + x_3 \in \mathbb{Q}[x_1,\dots,x_7]$. If we are given a polynomial
$f \in \mathbb{Q}[x]$ of degree 7 with roots $\alpha_1,\dots,\alpha_7$, then we get the resolvent

\[ \Theta_f(y) = \prod_{1 \le i < j < k \le 7} \bigl(y - (\alpha_i + \alpha_j + \alpha_k)\bigr) \in \mathbb{Q}[y]. \]

There is one factor for each three-element subset of $\{1,\dots,7\}$, so that $\Theta_f(y)$ has
degree $\binom{7}{3} = 35$. Then we have the following interesting result of [32].


Proposition 13.3.9 Let $f \in \mathbb{Q}[x]$ be irreducible of degree 7, and let $\Theta_f(y)$ be the
above resolvent of degree 35, which we assume to be separable. Then the Galois
group of $f$ over $\mathbb{Q}$ is isomorphic to $\mathrm{GL}(3, \mathbb{F}_2)$ if and only if $\Theta_f(y) = g(y)h(y)$, where
$g(y), h(y) \in \mathbb{Q}[y]$ are irreducible of degrees 7 and 28, respectively.


Proof: First suppose that the Galois group of $f$ over $\mathbb{Q}$ is isomorphic to $\mathrm{GL}(3, \mathbb{F}_2)$.
The transitive subgroups of $S_7$ are known (see [3, pp. 206-209]) and in particular,
any subgroup of $S_7$ isomorphic to $\mathrm{GL}(3, \mathbb{F}_2)$ is conjugate to the subgroup coming
from (13.37). By relabeling the roots, we may assume that

\[ G_f = \mathrm{GL}(3, \mathbb{F}_2) \subset S_7. \]

Since $\Theta_f(y)$ is separable, its irreducible factorization is governed by the action of
the Galois group on its roots, which is equivalent to the action of $\mathrm{GL}(3, \mathbb{F}_2)$ on
three-element subsets of $\{1,\dots,7\}$ (be sure you understand this). Hence we need to
understand the action of $\mathrm{GL}(3, \mathbb{F}_2)$ on unordered triples of nonzero vectors of $\mathbb{F}_2^3$.

A one-dimensional subspace of $\mathbb{F}_2^3$ is a line through the origin, which consists of 0
and a nonzero vector, since we are over $\mathbb{F}_2$. Hence there are seven such subspaces. In
Exercise 10 you will show that $\mathbb{F}_2^3$ also has seven two-dimensional subspaces, each
of which consists of 0 together with three linearly dependent nonzero vectors.
It follows that of the 35 possible triples of nonzero vectors in $\mathbb{F}_2^3$, seven consist
of linearly dependent vectors while the remaining 28 consist of linearly independent
vectors. In Exercise 10 you will show that $\mathrm{GL}(3, \mathbb{F}_2)$ acts transitively on each of
these sets of triples. As explained above, this describes the Galois action on the roots
of $\Theta_f(y)$, and the desired factorization follows.

The converse is proved in [32]. Let the Galois group of $f$ be isomorphic to
$G_f \subset S_7$. Since $\Theta_f(y)$ is separable, the Galois action on its roots is equivalent to
the action of $G_f$ on three-element subsets of $\{1,\dots,7\}$. The conjugacy classes of
subgroups of $S_7$ are known, and for each conjugacy class, one can compute the orbits of
its action on unordered triples. These are listed in Table I in [32]. Inspection of this
list shows that $\mathrm{GL}(3, \mathbb{F}_2) \subset S_7$ is the only subgroup (up to conjugacy) such that the
orbits have lengths 7 and 28. Thus $G_f$ must be conjugate to $\mathrm{GL}(3, \mathbb{F}_2)$ when $\Theta_f(y)$
has irreducible factors of degrees 7 and 28. ∎
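The orbit count used in the first half of the proof is small enough to confirm by brute force. The Python sketch below is an illustration with hypothetical helper names: it lists all invertible $3 \times 3$ matrices over $\mathbb{F}_2$, lets them act on three-element sets of nonzero vectors of $\mathbb{F}_2^3$, and records the orbit sizes.

```python
from itertools import combinations, product


def det_mod2(m):
    """Determinant of a 3x3 matrix (tuple of rows), reduced mod 2."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return (a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)) % 2


def apply(m, v):
    """Matrix-vector product over F_2."""
    return tuple(sum(m[r][k] * v[k] for k in range(3)) % 2 for r in range(3))


rows = list(product((0, 1), repeat=3))
matrices = [m for m in product(rows, repeat=3) if det_mod2(m) == 1]  # GL(3, F_2)
vectors = [v for v in rows if any(v)]                                # nonzero vectors


def orbit_sizes():
    """Sizes of the orbits of GL(3, F_2) on three-element sets of nonzero vectors."""
    remaining = {frozenset(t) for t in combinations(vectors, 3)}
    sizes = []
    while remaining:
        triple = next(iter(remaining))
        orbit = {frozenset(apply(m, v) for v in triple) for m in matrices}
        sizes.append(len(orbit))
        remaining -= orbit
    return sorted(sizes)
```

The group has 168 elements and the 35 triples fall into two orbits, of sizes 7 (linearly dependent triples) and 28 (linearly independent triples), matching the degrees of $g(y)$ and $h(y)$.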


Here is the example mentioned earlier.

Example 13.3.10 For $f = x^7 - 154x + 99$, [11] computes that

\begin{align*}
\Theta_f(y) ={} & y^{35} - 6160y^{29} + 29898y^{28} - 38277624y^{23} - 41255676y^{22} \\
& + 37518228y^{21} + 18524283008y^{17} + 6522421752y^{16} \\
& + 27295157736y^{15} + 35173338750y^{14} - 2894923232432y^{11} \\
& + 489571380144y^{10} - 4925879415072y^{9} + 3933790086996y^{8} \\
& - 702099623709y^{7} + 149674336745472y^{5} - 96219216479232y^{4} \\
& - 25773004414080y^{3} + 21354775085952y^{2} + 946763427456y \\
& - 1217267263872.
\end{align*}

Using Maple or Mathematica, the irreducible factorization of $\Theta_f(y)$ over $\mathbb{Q}$ is easily
computed to be

\[ \Theta_f(y) = g(y)\,h(y), \]

where

\[ g(y) = y^7 - 231y^3 - 462y^2 + 77y + 66 \]

and $h(y)$ is the polynomial

\begin{align*}
& y^{28} + 231y^{24} + 462y^{23} - 6237y^{22} + 29832y^{21} + 53361y^{20} \\
& + 213444y^{19} - 1245090y^{18} + 3958878y^{17} - 11719092y^{16} + 30817248y^{15} \\
& - 157564143y^{14} + 319312224y^{13} - 796323990y^{12} + 1481906118y^{11} \\
& - 2994381313y^{10} + 5889443406y^{9} + 965064177y^{8} - 4595839182y^{7} \\
& + 33180883659y^{6} - 84492127566y^{5} + 181691003340y^{4} \\
& - 382065796728y^{3} + 152613801648y^{2} + 35862251040y - 18443443392.
\end{align*}

By Proposition 13.3.9, the Galois group of $f$ over $\mathbb{Q}$ is isomorphic to $\mathrm{GL}(3, \mathbb{F}_2)$. ◁▷
The galois command in Maple computes the Galois group over $\mathbb{Q}$ of an irreducible
polynomial of degree < 9 in $\mathbb{Q}[x]$. The algorithm used by Maple involves
factoring resolvents of the above type. See [32] for more details.

The computer algebra programs GAP [12] and Magma [24] can also compute the 
Galois groups over Q. GAP can handle polynomials of degree < 15. Magma, on 
the other hand, has no a priori limitation on the degree of the polynomial, though 
computations for degrees > 50 are rarely successful. 


Mathematical Notes 
We will discuss two topics from this section. 


■ Trinomials of Degree 7. In the text, we showed that the Galois group over $\mathbb{Q}$ of
$x^7 - 154x + 99$ is isomorphic to $\mathrm{GL}(3, \mathbb{F}_2)$. In the late 1960s, $x^7 - 7x + 3$ was shown
to have the same property. Bruin and Elkies [2] consider the problem of finding
all trinomials $f = ax^7 + bx + c \in \mathbb{Q}[x]$ with $\mathrm{GL}(3, \mathbb{F}_2)$ as Galois group. We say that
another trinomial $g$ is equivalent to $f$ if

\[ g = \lambda\bigl(a(\mu x)^7 + b(\mu x) + c\bigr) \]

for some $\lambda, \mu \in \mathbb{Q}^*$. By [2], equivalence classes of trinomials over $\mathbb{Q}$ whose Galois
group is contained in $\mathrm{GL}(3, \mathbb{F}_2)$ correspond to solutions $(x,y) \in \mathbb{Q}^2$ of the equation

\[ y^2 = x(81x^5 + 396x^4 + 738x^3 + 660x^2 + 269x + 48). \tag{13.38} \]

By finding all solutions $(x,y) \in \mathbb{Q}^2$ (including points at infinity), one gets the result
that, up to equivalence, the only trinomials $ax^7 + bx + c \in \mathbb{Q}[x]$ with $\mathrm{GL}(3, \mathbb{F}_2)$ as
Galois group are

\[ x^7 - 7x + 3, \quad x^7 - 154x + 99, \quad 37^2x^7 - 28x + 9, \quad 499^2x^7 - 23956x + 3^4 \cdot 113. \]


Details and references can be found in [2]. Also, [11] and [32] include references to
other papers on polynomials of degree 7 with $\mathrm{GL}(3, \mathbb{F}_2)$ as Galois group.

The equation (13.38) is another example of a Diophantine equation. In contrast
to Example 13.1.8, this is not an elliptic curve. Instead, it has genus 2 (while elliptic
curves have genus 1). By Faltings’s proof of the Mordell Conjecture, it is known that
equations of genus $\ge 2$ have at most finitely many rational solutions, i.e., solutions
with $(x,y) \in \mathbb{Q}^2$. But the proof is not constructive, so that in practice it can be difficult
to prove that one has found all rational solutions.


■ Groups and Geometry. The group $\mathrm{GL}(3, \mathbb{F}_2)$ is important for reasons related to
group theory and geometry. Let us begin with the group theory. In the Mathematical
Notes to Section 11.1, we defined $\mathrm{GL}(n, F)$ to be the group of invertible $n \times n$ matrices
with entries in the field $F$. This group contains the subgroup $\mathrm{SL}(n, F)$ of matrices
of determinant 1. Furthermore, taking the quotient of each of these groups by the
subgroup consisting of multiples of the identity matrix gives groups

\[ \mathrm{PGL}(n, F) \quad\text{and}\quad \mathrm{PSL}(n, F). \]

We will say more about these groups in the Mathematical Notes to Section 14.3.
There, we will see that aside from $\mathrm{PSL}(2, \mathbb{F}_2) \simeq S_3$ and $\mathrm{PSL}(2, \mathbb{F}_3) \simeq A_4$, the group
$\mathrm{PSL}(n, \mathbb{F}_q)$ is simple whenever $n \ge 2$.

In particular, $\mathrm{PSL}(3, \mathbb{F}_2)$ is simple. However, by Exercise 11 we have

\[ \mathrm{GL}(3, \mathbb{F}_2) = \mathrm{SL}(3, \mathbb{F}_2) \simeq \mathrm{PGL}(3, \mathbb{F}_2) = \mathrm{PSL}(3, \mathbb{F}_2). \]

This explains why $\mathrm{GL}(3, \mathbb{F}_2)$ is simple. In Section 14.3 we will also see that

\[ \mathrm{PSL}(3, \mathbb{F}_2) \simeq \mathrm{PSL}(2, \mathbb{F}_7). \]

For this reason, some papers, such as [11], use $\mathrm{PSL}(2, \mathbb{F}_7)$ instead of $\mathrm{GL}(3, \mathbb{F}_2)$.

For any field $F$, $\mathrm{GL}(3, F)$ and $\mathrm{PGL}(3, F)$ have interesting geometric properties.
For $\mathrm{GL}(3, F)$, the geometric object it acts on is the vector space $F^3$ of dimension 3
over $F$. For $\mathrm{PGL}(3, F)$, the corresponding geometric object is the projective plane $\mathbb{P}^2$
over $F$. Although this is beyond the scope of the book, we will make one comment
related to the proof of Proposition 13.3.9. There, we observed that $\mathbb{F}_2^3$ has seven
one-dimensional subspaces and seven two-dimensional subspaces. Once you understand
the geometry of $\mathbb{P}^2$ over $F = \mathbb{F}_2$, this follows immediately from projective duality.
More on projective geometry can be found in [29].


Exercises for Section 13.3 


Exercise 1. Let $f(x) \in \mathbb{Q}[x]$.
(a) Prove that there are $\lambda, \mu \in \mathbb{Q}^*$ such that $g(x) = \lambda f(\mu x) \in \mathbb{Z}[x]$ is monic.
(b) Prove that $f$ and $g$ have isomorphic Galois groups over $\mathbb{Q}$.


Exercise 2. Let $f(x) = x^n - c_1x^{n-1} + \cdots + (-1)^nc_n \in \mathbb{Z}[x]$, and let $\Theta_f(y)$ be the resolvent
built from $\varphi \in \mathbb{Z}[x_1,\dots,x_n]$. Prove that $\Theta_f(y) \in \mathbb{Z}[y]$.


Exercise 3. In the proof of Proposition 13.3.2, we asserted that

\[ \varphi(\alpha_1,\dots,\alpha_n) = \varphi(\alpha_{\tau(1)},\dots,\alpha_{\tau(n)}) \]

follows from $\beta = \varphi(\alpha_1,\dots,\alpha_n) \in F$ and $\tau \in G_f$. Prove this.


Exercise 4. As in Examples 13.3.3 and 13.3.4, let $\varphi = \sqrt{\Delta}\,(x_1 + x_2 - x_3 - x_4)$.
(a) Show that the symmetry group of $\varphi$ is $G = \langle(1324)\rangle \subset S_4$ in characteristic $\ne 2$.
(b) Show that in the universal case, $\varphi$ leads to the resolvent

\[ \Theta(y) = \prod_{i=1}^{3} \bigl(y^2 - \Delta\,(4y_i + \sigma_1^2 - 4\sigma_2)\bigr), \]

where $y_1 = x_1x_2 + x_3x_4$, $y_2 = x_1x_3 + x_2x_4$, $y_3 = x_1x_4 + x_2x_3$ are the roots of the universal
Ferrari resolvent $\theta(y)$.

(c) Let $\Theta_f(y)$ be obtained by specializing the resolvent $\Theta(y)$ of part (b) to $f = x^4 + bx^2 + d$.
Show that

\[ \Theta_f(y) = y^2\bigl((y^2 + 4b\,\Delta(f))^2 - 2^6 d\,\Delta(f)^2\bigr). \]


Exercise 5. This problem will state and prove a relative version of Proposition 13.3.2. Fix a
subgroup $H \subset S_n$ and suppose that $f \in F[x]$ is separable of degree $n$ and that $G_f \subset H$. Now
let $G \subset H$ be a subgroup. We want to know whether or not $G_f$ lies in the smaller subgroup $G$.
Let $\varphi \in F[x_1,\dots,x_n]$ have $G$ as its symmetry group and let $\varphi_1 = \varphi, \varphi_2, \dots, \varphi_\ell$ be the orbit of
$H$ acting on $\varphi$. Then set

\[ \Theta^H(y) = \prod_{i=1}^{\ell} (y - \varphi_i) \in F[x_1,\dots,x_n,y]. \]

Finally, if $\alpha_1,\dots,\alpha_n$ are the roots of $f$ in a splitting field $L$, let

\[ \Theta^H_f(y) = \prod_{i=1}^{\ell} \bigl(y - \varphi_i(\alpha_1,\dots,\alpha_n)\bigr) \]

be the polynomial obtained by $x_i \mapsto \alpha_i$.
(a) Explain why the degree of $\Theta^H_f(y)$ is the index of $G$ in $H$.
(b) Prove that $\Theta^H_f(y) \in F[y]$.
(c) Assume that $G_f$ is conjugate within $H$ to a subgroup of $G$ (this means that $\tau G_f \tau^{-1} \subset G$
for some $\tau \in H$). Prove that $\Theta^H_f(y)$ has a root in $F$.
(d) Assume that $\Theta^H_f(y)$ has a simple root in $F$. Prove that $G_f$ is conjugate within $H$ to a
subgroup of $G$.
We call $\Theta^H_f(y)$ a relative resolvent. You will verify in Exercise 12 that (13.31) from
Example 13.3.4 is an example of a relative resolvent.


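Exercise 6. Let $D = \sum_{\sigma \in A_4} \sigma\cdot(x_1^3x_2^2x_3) \in F[x_1,x_2,x_3,x_4]$.
(a) Prove that $D = \tfrac{1}{2}(\sigma_1\sigma_2\sigma_3 - 3\sigma_1^2\sigma_4 - 3\sigma_3^2 + 4\sigma_2\sigma_4) + \tfrac{1}{2}\sqrt{\Delta}$ in characteristic $\ne 2$.
(b) Prove that $\sqrt{\Delta} = D - (12)\cdot D$ in all characteristics.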


Exercise 7. As in Example 13.3.7, let $f = x^4 + (u+1)x^2 + ux + 1 \in F[x]$, where $F = \mathbb{F}_2(u)$.
(a) Use Gauss’s Lemma and the Schönemann–Eisenstein criterion to show that $f$ is irreducible
over $F$. (These results apply since $\mathbb{F}_2[u]$ is a PID.)
(b) Verify the formulas for $D_f(y)$ and $\theta_f(y)$ given in Example 13.3.7.
(c) Show that $y^2 + uy + 1$ is irreducible over the splitting field of $D_f(y)$.


Exercise 8. Let $f \in F[x]$ be an irreducible quartic, where $F$ has characteristic $\ne 2$. Also let
$\Theta_f(y)$ be the sextic resolvent defined in Example 13.3.4. The goal of this exercise is to show
that $G_f \subset S_4$ determines the irreducible factorization of $\Theta_f(y)$ over $F$. We will assume that
$\Theta_f(y)$ is separable.

(a) First suppose that $G_f = A_4$ or $S_4$. Prove that $\Theta_f(y)$ is irreducible over $F$.

(b) Now suppose that $G_f = \langle(1324),(12)\rangle$. Prove that $\Theta_f(y) = g(y)h(y)$, where $g(y), h(y) \in
F[y]$ are irreducible of degrees 2 and 4, respectively.

(c) Suppose that $G_f = \langle(12)(34), (13)(24)\rangle$. Prove that $\Theta_f(y) = g_1(y)g_2(y)g_3(y)$, where
each $g_i(y) \in F[y]$ is irreducible of degree 2.

(d) Finally, suppose that $G_f = \langle(1324)\rangle$. Prove that $\Theta_f(y) = g_1(y)g_2(y)g_3(y)$, where $g_1(y)$,
$g_2(y)$, $g_3(y) \in F[y]$ are irreducible of degrees 1, 1, and 4, respectively.

(e) Explain why parts (a) through (d) enable one to determine $G_f$ up to conjugacy using only
$\Theta_f(y)$ and $\Delta(f)$.

Notice that the claim made in Example 13.3.8 now follows immediately.


Exercise 9. The action of $\mathrm{GL}(3,\mathbb{F}_2)$ on the nonzero vectors of $\mathbb{F}_2^3$ gives a group homomorphism
$\mathrm{GL}(3,\mathbb{F}_2) \to S_7$. Prove that this map is one-to-one.


Exercise 10. Consider the vector space $\mathbb{F}_2^3$.
(a) Prove that $\mathbb{F}_2^3$ has exactly seven two-dimensional subspaces.
(b) For a field $F$, let $B = \{\{v_1,v_2,v_3\} \subset F^3 \mid v_1,v_2,v_3 \text{ are linearly independent over } F\}$. Prove
that $\mathrm{GL}(3,F)$ acts transitively on $B$.
(c) Let $F$ be as in part (b). Prove that $\mathrm{GL}(3,F)$ acts transitively on the set of two-dimensional
subspaces of $F^3$.
Be sure you understand how parts (b) and (c) apply to the proof of Proposition 13.3.9.


Exercise 11. Prove that $\mathrm{GL}(3,\mathbb{F}_2) = \mathrm{SL}(3,\mathbb{F}_2) \simeq \mathrm{PGL}(3,\mathbb{F}_2) = \mathrm{PSL}(3,\mathbb{F}_2)$.


Exercise 12. Prove that (13.31) from Example 13.3.4 is an example of a relative resolvent in 
the sense of Exercise 5. 


Exercise 13. In the proof of Proposition 13.3.9, we showed that when $\mathrm{GL}(3,\mathbb{F}_2) \subset S_7$ acts on
three-element subsets of $\{1,\dots,7\}$, the orbits have lengths 7 and 28. We also asserted that up
to conjugacy, $\mathrm{GL}(3,\mathbb{F}_2)$ is the only subgroup of $S_7$ with this property. In this exercise, you
will study the action of some other subgroups of $S_7$.
(a) Prove that $A_7$ and $S_7$ act transitively on three-element subsets of $\{1,\dots,7\}$. Thus there is
one orbit of length 35 for these groups.
(b) In Section 13.2, the group $\mathrm{AGL}(1,\mathbb{F}_5) \subset S_5$ played an important role in understanding
the Galois group of a quintic. In a similar way, we have $\mathrm{AGL}(1,\mathbb{F}_7) \subset S_7$ provided we
think of the indices as congruence classes modulo 7. Prove that the orbits of $\mathrm{AGL}(1,\mathbb{F}_7)$
acting on the triples $\{0,1,2\}$ and $\{0,1,3\}$ have 21 and 14 elements, respectively.


Exercise 14. The quadratic resolvent $D_f(y)$ used in Theorem 13.3.5 to compute the Galois
group of a quartic in all characteristics was defined for a polynomial $f$ of degree 4. Here you
will study what happens when $f$ is monic of degree $n$. We begin with the polynomial

$$D = \sum_{\sigma \in A_n} \sigma\cdot\bigl(x_1^{n-1}x_2^{n-2}\cdots x_{n-2}^{2}\,x_{n-1}\bigr) \in F[x_1,\dots,x_n],$$

where $F$ is a field of any characteristic.
(a) Prove that $A_n$ is the symmetry group of $D$.
(b) Prove that $\sqrt{\Delta} = \prod_{1 \le i < j \le n}(x_i - x_j)$ satisfies $\sqrt{\Delta} = D - D'$, where $D' = (12)\cdot D$.
(c) Let $f \in F[x]$ be monic of degree $n$ and let $\alpha_1,\dots,\alpha_n$ be the roots of $f$ in some splitting
field $L$. Then define $D(f) = D(\alpha_1,\dots,\alpha_n)$ and $D'(f) = D'(\alpha_1,\dots,\alpha_n)$, and set

$$D_f(y) = (y - D(f))(y - D'(f)).$$

Prove that $D_f(y) \in F[y]$ and that the discriminant of $D_f(y)$ is $\Delta(f) = \prod_{1 \le i < j \le n}(\alpha_i - \alpha_j)^2$.
Note that $D(f)$ and $D'(f)$ depend on how we order the roots while the polynomial $D_f(y)$
depends only on $f$.
(d) Assume that $f$ is separable and let $\mathrm{Gal}(L/F) \simeq G_f \subset S_n$. Prove that $G_f \subset A_n$ if and only
if $D_f(y)$ splits over $F$.
This gives a version of Theorem 7.4.1 that works in all characteristics.


Exercise 15. Let $f = x^3 - c_1x^2 + c_2x - c_3 \in F[x]$ and let $D_f(y) \in F[y]$ be as in the previous
exercise.
(a) Show that $D_f(y) = y^2 - (c_1c_2 - 3c_3)y + c_2^3 + c_1^3c_3 - 6c_1c_2c_3 + 9c_3^2$.
(b) Assume in addition that $f$ is separable and irreducible. Explain how $D_f(y)$ determines
the Galois group of $f$ up to isomorphism.
This gives a version of Proposition 7.4.2 that works in all characteristics. See [6] for some
examples.


13.4 OTHER METHODS


This section will explore further tools for computing Galois groups. We begin with 
a result of Kronecker that works in complete generality but is not very efficient. 
However, his method also leads to a quick proof of a result of Dedekind that uses 
reduction modulo p to obtain useful information about Galois groups over Q. 


A. Kronecker’s Analysis. In Section 12.3 we studied Kronecker’s construction 
of the splitting field of a separable polynomial f € F [x] of degree n. Let us recall 
how this works. 

Assume that $F$ is infinite, and let $\alpha_1,\dots,\alpha_n$ be the roots of $f$ in a splitting field
$F \subset L$. We saw in Section 12.2 that there are $t_1,\dots,t_n \in F$ such that the $n!$ elements

$$t_1\alpha_{\sigma(1)} + \cdots + t_n\alpha_{\sigma(n)}, \quad \sigma \in S_n,$$

are distinct. Thus

$$s(y) = \prod_{\sigma \in S_n} \bigl(y - (t_1\alpha_{\sigma(1)} + \cdots + t_n\alpha_{\sigma(n)})\bigr) \in L[y] \tag{13.39}$$

is separable of degree $n!$. We showed in Section 12.3 that $s(y) \in F[y]$ and that if
$h(y) \in F[y]$ is any irreducible factor of $s(y)$, then the quotient

$$F[y]/\langle h(y) \rangle$$

is a splitting field of $f$ over $F$. It follows that the degree of $h(y)$ is the order of the
Galois group of $f$ over $F$.

This construction seems to require the roots $\alpha_1,\dots,\alpha_n$. However, the universal
version of $s(y)$ is given by

$$S(y) = \prod_{\sigma \in S_n} \bigl(y - (t_1x_{\sigma(1)} + \cdots + t_nx_{\sigma(n)})\bigr) \in F[x_1,\dots,x_n,y]. \tag{13.40}$$

The theory of symmetric polynomials tells us how to write $S(y)$ explicitly as a
polynomial in $F[\sigma_1,\dots,\sigma_n,y]$. Then specializing to the coefficients of $f$ gives $s(y)$
as in (13.39). Furthermore, Exercises 4 and 5 of Section 12.3 show how to pick
$t_1,\dots,t_n \in F$ (without knowing the roots) so that $s(y)$ is separable.

Here is an example of this process. 


Example 13.4.1 Consider $f = x^3 + x^2 - 2x - 1 \in \mathbb{Q}[x]$. In Exercise 1 you will show
that if we set $t_1 = -1$, $t_2 = 1$, and $t_3 = 2$, then the universal polynomial (13.40) becomes

$$\begin{aligned}
S(y) = {} & y^6 - 4\sigma_1y^5 + (2\sigma_1^2 + 14\sigma_2)y^4 + (8\sigma_1^3 - 44\sigma_1\sigma_2 + 20\sigma_3)y^3 \\
& + (-7\sigma_1^4 + 18\sigma_1^2\sigma_2 + 49\sigma_2^2 - 40\sigma_1\sigma_3)y^2 + (-4\sigma_1^5 + 44\sigma_1^3\sigma_2 \\
& - 112\sigma_1\sigma_2^2 - 20\sigma_1^2\sigma_3 + 140\sigma_2\sigma_3)y + 4\sigma_1^6 - 32\sigma_1^4\sigma_2 + 55\sigma_1^2\sigma_2^2 \\
& + 36\sigma_2^3 + 76\sigma_1^3\sigma_3 - 322\sigma_1\sigma_2\sigma_3 + 343\sigma_3^2.
\end{aligned}$$

Using $\sigma_1 \mapsto -1$, $\sigma_2 \mapsto -2$, $\sigma_3 \mapsto 1$, we obtain

$$\begin{aligned}
s(y) &= y^6 + 4y^5 - 26y^4 - 76y^3 + 193y^2 + 240y - 377 \\
&= (y^3 + 2y^2 - 15y + 13)(y^3 + 2y^2 - 15y - 29),
\end{aligned}$$

where the second line is the irreducible factorization in $\mathbb{Q}[y]$. This shows that $s(y)$
is separable (do you see why?). Thus the Galois group of $f$ over $\mathbb{Q}$ has order 3.
(This also follows from the theory of Chapter 9, since $f$ is the minimal polynomial
of $2\cos(2\pi/7) = \zeta_7 + \zeta_7^{-1}$.) ◁▷
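
The arithmetic in Example 13.4.1 is easy to check by machine. The following Python sketch specializes the coefficients of $S(y)$ at $\sigma_1 = -1$, $\sigma_2 = -2$, $\sigma_3 = 1$ and compares the result with the product of the two cubic factors. It is only an illustration: the helper names `universal_coeffs` and `poly_mul` are invented here, and the coefficient formulas are typed in by hand for the choice $(t_1,t_2,t_3) = (-1,1,2)$, so a transcription slip would show up as a failed comparison.

```python
def universal_coeffs(s1, s2, s3):
    """Coefficients of S(y), from y^6 down to the constant term, as
    functions of the elementary symmetric polynomials s1, s2, s3."""
    return [
        1,
        -4*s1,
        2*s1**2 + 14*s2,
        8*s1**3 - 44*s1*s2 + 20*s3,
        -7*s1**4 + 18*s1**2*s2 + 49*s2**2 - 40*s1*s3,
        -4*s1**5 + 44*s1**3*s2 - 112*s1*s2**2 - 20*s1**2*s3 + 140*s2*s3,
        4*s1**6 - 32*s1**4*s2 + 55*s1**2*s2**2 + 36*s2**3
        + 76*s1**3*s3 - 322*s1*s2*s3 + 343*s3**2,
    ]


def poly_mul(a, b):
    """Multiply two polynomials given as coefficient lists (highest degree first)."""
    out = [0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out


# f = x^3 + x^2 - 2x - 1 has sigma_1 = -1, sigma_2 = -2, sigma_3 = 1.
s_coeffs = universal_coeffs(-1, -2, 1)

# The two cubic factors claimed in the example.
product = poly_mul([1, 2, -15, 13], [1, 2, -15, -29])

print(s_coeffs)             # coefficients of s(y)
print(product == s_coeffs)  # do the factors multiply back to s(y)?
```

Since both cubic factors have degree 3, this also confirms numerically that the candidate order of the Galois group is 3; irreducibility of the factors over $\mathbb{Q}$ still has to be checked separately.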


Besides giving the order of the Galois group, Kronecker observed that by modify-
ing the above construction, one can extract the entire Galois group from an irreducible
factor of $s(y)$. The idea is that instead of letting $t_1,\dots,t_n$ be elements of $F$, we let
them be variables. To prevent confusion, we will label these variables $u_1,\dots,u_n$ and
write (13.39) as

$$s_u(y) = \prod_{\sigma \in S_n} \bigl(y - (u_1\alpha_{\sigma(1)} + \cdots + u_n\alpha_{\sigma(n)})\bigr) \in L[u_1,\dots,u_n,y]. \tag{13.41}$$

The subscript $u$ is a reminder that $s_u(y)$ is a polynomial in the $n+1$ variables
$u_1,\dots,u_n,y$. In Exercise 2 you will show that the coefficients lie in $F$, so that

$$s_u(y) \in F[u_1,\dots,u_n,y].$$

Furthermore, we can compute $s_u(y)$ by first working in the universal situation and
then specializing to $f$. Thus we can find $s_u(y)$ without knowing the roots of $f$.
The polynomial ring $F[u_1,\dots,u_n,y]$ has two key structures:
• $F[u_1,\dots,u_n,y]$ has an $S_n$-action given by permutations of $u_1,\dots,u_n$.
• $F[u_1,\dots,u_n,y]$ is a UFD.
For the splitting field $L$ of $f$, $L[u_1,\dots,u_n,y]$ has the same two structures plus a third:
• $L[u_1,\dots,u_n,y]$ has a $\mathrm{Gal}(L/F)$-action given by the Galois action on $L$.
As in the previous section, we write the Galois group of $f$ over $F$ as

$$\mathrm{Gal}(L/F) \simeq G_f \subset S_n.$$

The above structures, when applied to $s_u(y)$, give the following description of $G_f$.


Theorem 13.4.2 Assume that $f \in F[x]$ is monic and separable of degree $n$, where
$F$ is an arbitrary field. Also let $h \in F[u_1,\dots,u_n,y]$ be an irreducible factor of the
polynomial $s_u(y) \in F[u_1,\dots,u_n,y]$ constructed above. Then $G_f \subset S_n$ is conjugate to
the subgroup

$$G = \{\tau \in S_n \mid \tau\cdot h = h\} \subset S_n.$$


Proof: In Exercise 3 you will show that (13.41) is the irreducible factorization of
$s_u(y)$ in $L[u_1,\dots,u_n,y]$. Thus we can pick $\sigma \in S_n$ such that

$$y - (u_1\alpha_{\sigma(1)} + \cdots + u_n\alpha_{\sigma(n)}) \tag{13.42}$$

is a factor of $h$ in $L[u_1,\dots,u_n,y]$. The permutation $\sigma$ will be fixed for the remainder
of the proof. Our goal will be to prove that $G = \sigma^{-1}G_f\sigma$.
Consider the polynomial

$$\begin{aligned}
\tilde h &= \prod_{\gamma \in \mathrm{Gal}(L/F)} \bigl(y - (u_1\gamma(\alpha_{\sigma(1)}) + \cdots + u_n\gamma(\alpha_{\sigma(n)}))\bigr) \\
&= \prod_{\mu \in G_f} \bigl(y - (u_1\alpha_{\mu\sigma(1)} + \cdots + u_n\alpha_{\mu\sigma(n)})\bigr).
\end{aligned} \tag{13.43}$$

Standard arguments imply that $\tilde h$ is invariant under the action of $\mathrm{Gal}(L/F)$, so that
$\tilde h \in F[u_1,\dots,u_n,y]$ since $F \subset L$ is Galois. We can relate $h$ to $\tilde h$ as follows. Pick any
$\gamma \in \mathrm{Gal}(L/F)$. Since $h$ has coefficients in $F$ and (13.42) divides $h$ in $L[u_1,\dots,u_n,y]$,
it follows that

$$y - (u_1\gamma(\alpha_{\sigma(1)}) + \cdots + u_n\gamma(\alpha_{\sigma(n)}))$$

divides $\gamma\cdot h = h$ in $L[u_1,\dots,u_n,y]$. Hence $\tilde h$ divides $h$ in $L[u_1,\dots,u_n,y]$, which by
Exercise 3 implies that $\tilde h$ divides $h$ in $F[u_1,\dots,u_n,y]$. Since $h$ is irreducible, it follows
that $h = \tilde h$ after multiplying $h$ by a suitable constant.
Now suppose that $\tau \in S_n$ satisfies $\tau\cdot h = h$. This implies that $\tau$ applied to (13.42)
is a factor of $h$ in $L[u_1,\dots,u_n,y]$. Since (13.43) is the irreducible factorization of $h$ in
$L[u_1,\dots,u_n,y]$, we must have

$$y - (u_{\tau(1)}\alpha_{\sigma(1)} + \cdots + u_{\tau(n)}\alpha_{\sigma(n)}) = y - (u_1\alpha_{\mu\sigma(1)} + \cdots + u_n\alpha_{\mu\sigma(n)})$$

for some $\mu \in G_f$. Since $u_1,\dots,u_n$ are distinct variables, this implies that

$$\tau(i) = j \implies \alpha_{\sigma(i)} = \alpha_{\mu\sigma(j)} \implies \alpha_{\sigma\tau^{-1}(j)} = \alpha_{\mu\sigma(j)}.$$

It follows that $\sigma\tau^{-1} = \mu\sigma$, since $\alpha_1,\dots,\alpha_n$ are distinct. Thus

$$\tau = \sigma^{-1}\mu^{-1}\sigma \in \sigma^{-1}G_f\sigma.$$

This shows that $G \subset \sigma^{-1}G_f\sigma$. You will prove the opposite inclusion in Exercise 3,
which implies $G = \sigma^{-1}G_f\sigma$. This completes the proof of the theorem. ■


Theorem 13.4.2 gives an algorithm for computing the Galois group of $f$:
• Compute $s_u(y) \in F[u_1,\dots,u_n,y]$.
• Factor $s_u(y)$ into irreducibles, and let $h$ be an irreducible factor.
• For each $\tau \in S_n$, compute $\tau\cdot h$ and compare it with $h$.
• Then the Galois group of $f$ over $F$ is isomorphic to $\{\tau \in S_n \mid \tau\cdot h = h\}$.
For $n$ large, this algorithm is extremely inefficient. For example, $s_u(y)$ has degree
$10! = 3628800$ in $y$ when $\deg(f) = 10$. Finding an irreducible factor $h$ of $s_u(y)$ could
take a long time. And even if we could find $h$, then we would need to compute $\tau\cdot h$
for all 3628800 permutations $\tau \in S_{10}$. Thus this algorithm is not useful in practice,
although it is a completely general method for computing Galois groups.

Here is an example of how to use Theorem 13.4.2 when $n$ is small.


Example 13.4.3 Consider $f = x^3 + x^2 - 2x - 1 \in \mathbb{Q}[x]$. In Exercise 4 you will show
that $s_u(y)$ has the irreducible factorization

$$\begin{aligned}
s_u(y) = {} & (u_1^3 + 3u_1^2u_2 - 4u_1u_2^2 + u_2^3 - 4u_1^2u_3 - u_1u_2u_3 + 3u_2^2u_3 + 3u_1u_3^2 \\
& - 4u_2u_3^2 + u_3^3 + 2u_1^2y - 3u_1u_2y + 2u_2^2y - 3u_1u_3y - 3u_2u_3y + 2u_3^2y \\
& - u_1y^2 - u_2y^2 - u_3y^2 - y^3) \times (u_1^3 - 4u_1^2u_2 + 3u_1u_2^2 + u_2^3 + 3u_1^2u_3 \\
& - u_1u_2u_3 - 4u_2^2u_3 - 4u_1u_3^2 + 3u_2u_3^2 + u_3^3 + 2u_1^2y - 3u_1u_2y \\
& + 2u_2^2y - 3u_1u_3y - 3u_2u_3y + 2u_3^2y - u_1y^2 - u_2y^2 - u_3y^2 - y^3)
\end{aligned}$$

in $\mathbb{Q}[u_1,u_2,u_3,y]$. (This calculation was done in Mathematica.) Let $h$ be the first
factor multiplied by $-1$. You will show in Exercise 4 that

$$\begin{aligned}
h = {} & y^3 + (u_1+u_2+u_3)y^2 + \bigl(7(u_1u_2 + u_1u_3 + u_2u_3) - 2(u_1+u_2+u_3)^2\bigr)y \\
& + 7u_1u_2u_3 - (u_1+u_2+u_3)^3 + 7(u_1u_2^2 + u_1^2u_3 + u_2u_3^2).
\end{aligned}$$

In this formula for $h$, everything is symmetric in $u_1,u_2,u_3$ except for the last set of
parentheses. Thus $\tau \in S_3$ fixes $h$ if and only if $\tau$ fixes

$$u_1u_2^2 + u_1^2u_3 + u_2u_3^2.$$

It follows easily that the group $G$ of Theorem 13.4.2 is $\langle (123) \rangle \subset S_3$. ◁▷
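
The final step of Example 13.4.3 is a finite check over the six elements of $S_3$, which can be sketched in Python as follows. The representation is an assumption made for this illustration only: a polynomial in $u_1,u_2,u_3$ is stored as a dictionary from exponent tuples to coefficients, a permutation $\tau$ is the tuple $(\tau(1),\tau(2),\tau(3))$, and $\tau$ acts by sending $u_i$ to $u_{\tau(i)}$. The names `act`, `remainder`, and `stabilizer` are hypothetical.

```python
from itertools import permutations

# Non-symmetric part of h: u1*u2^2 + u1^2*u3 + u2*u3^2,
# stored as {(e1, e2, e3): coefficient}.
remainder = {(1, 2, 0): 1, (2, 0, 1): 1, (0, 1, 2): 1}


def act(tau, poly):
    """Apply tau to poly by replacing u_i with u_{tau(i)}.

    tau is a tuple with tau[i - 1] = tau(i), using labels 1, 2, 3.
    """
    result = {}
    for exps, coeff in poly.items():
        new_exps = [0] * len(exps)
        for i, e in enumerate(exps):
            new_exps[tau[i] - 1] += e  # exponent of u_{i+1} moves to u_{tau(i+1)}
        key = tuple(new_exps)
        result[key] = result.get(key, 0) + coeff
    return result


# Permutations of {1, 2, 3} that leave the remainder unchanged.
stabilizer = [tau for tau in permutations((1, 2, 3))
              if act(tau, remainder) == remainder]
print(stabilizer)
```

In this encoding the tuples `(2, 3, 1)` and `(3, 1, 2)` are the 3-cycles $(123)$ and $(132)$, so the stabilizer is the cyclic group of order 3, while every transposition moves the polynomial to the other orbit sum $u_1^2u_2 + u_2^2u_3 + u_1u_3^2$.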
Note that f is not required to be irreducible over F. Here is an example. 
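Example 13.4.4 For $f = x^3 - 1 \in \mathbb{Q}[x]$, you will show in Exercise 5 that

$$\begin{aligned}
s_u(y) = {} & \bigl(y^2 + (u_1 + u_2 - 2u_3)y + u_1^2 + u_2^2 + u_3^2 - u_1u_2 - u_1u_3 - u_2u_3\bigr) \\
& \times \bigl(y^2 + (u_1 + u_3 - 2u_2)y + u_1^2 + u_2^2 + u_3^2 - u_1u_2 - u_1u_3 - u_2u_3\bigr) \\
& \times \bigl(y^2 + (u_2 + u_3 - 2u_1)y + u_1^2 + u_2^2 + u_3^2 - u_1u_2 - u_1u_3 - u_2u_3\bigr).
\end{aligned}$$

In each factor, the terms of degree 0 in $y$ are symmetric in $u_1,u_2,u_3$. So the coefficient
of $y$ is the crucial term. It follows that the first factor gives $G = \langle (12) \rangle$, the second
gives $G = \langle (13) \rangle$, and the third gives $G = \langle (23) \rangle$. ◁▷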


Although Example 13.4.4 is trivial from the point of view of Galois theory, it does 
show that Theorem 13.4.2 determines the Galois group only up to conjugacy. 

In the earlier part of this section we assumed that $F$ was infinite so that we could find
$t_1,\dots,t_n \in F$ such that the $n!$ elements $t_1\alpha_{\sigma(1)} + \cdots + t_n\alpha_{\sigma(n)}$ were distinct. In contrast,
Theorem 13.4.2 applies to all fields, even finite ones. This works because $u_1,\dots,u_n$
are variables, so that the expressions $u_1\alpha_{\sigma(1)} + \cdots + u_n\alpha_{\sigma(n)}$ are automatically distinct
by the separability of $f$.

We will soon see that applying Theorem 13.4.2 over a finite field has some nice
consequences.


B. Dedekind's Theorem. Given a polynomial $f \in \mathbb{Z}[x]$ and a prime $p$, we let
$\bar f \in \mathbb{F}_p[x]$ be the polynomial obtained by reducing the coefficients of $f$ modulo $p$.
Then the following theorem of Dedekind shows how $\bar f$ can give information about
the Galois group of $f$ over $\mathbb{Q}$.


Theorem 13.4.5 Let $f \in \mathbb{Z}[x]$ be monic and separable of degree $n$. Given a prime $p$
such that $p \nmid \Delta(f)$, let

$$\bar f = \bar f_1\bar f_2\cdots\bar f_r,$$

where $\bar f_1,\dots,\bar f_r \in \mathbb{F}_p[x]$ are monic and irreducible. Also set $d_i = \deg(\bar f_i)$. Then:
(a) The Galois group of $\bar f$ over $\mathbb{F}_p$ is cyclic of order $\mathrm{lcm}(d_1,d_2,\dots,d_r)$.
(b) The Galois group of $f$ over $\mathbb{Q}$ contains an element that acts on the roots of $f$
according to a product of disjoint cycles of the form

$$\underbrace{(\ast\,\cdots\,\ast)}_{d_1\text{-cycle}}\ \underbrace{(\ast\,\cdots\,\ast)}_{d_2\text{-cycle}}\ \cdots\ \underbrace{(\ast\,\cdots\,\ast)}_{d_r\text{-cycle}}.$$

Hence the Galois group of $f$ contains an element of order $\mathrm{lcm}(d_1,d_2,\dots,d_r)$.


Proof: First observe that $\bar f$ is separable, since $p \nmid \Delta(f)$ and $\Delta(\bar f)$ is the reduction of
$\Delta(f)$ modulo $p$ by Exercise 4 of Section 5.3.
Part (a) is an easy application of Chapter 11. Since

$$x^{p^m} - x = \prod_{\alpha \in \mathbb{F}_{p^m}} (x - \alpha),$$

a separable polynomial in $\mathbb{F}_p[x]$ splits completely over $\mathbb{F}_{p^m}$ if and only if it divides
$x^{p^m} - x$. Thus:

$$\begin{aligned}
\bar f \text{ splits completely over } \mathbb{F}_{p^m} &\iff \bar f_i \text{ splits completely over } \mathbb{F}_{p^m} \text{ for all } i \\
&\iff \bar f_i \text{ divides } x^{p^m} - x \text{ for all } i \\
&\iff d_i = \deg(\bar f_i) \text{ divides } m \text{ for all } i \\
&\iff \mathrm{lcm}(d_1,d_2,\dots,d_r) \text{ divides } m,
\end{aligned}$$

where the second equivalence uses our above observation and the third equivalence
uses part (c) of Proposition 11.2.1. This easily implies that the splitting field of $\bar f$
over $\mathbb{F}_p$ is $\mathbb{F}_{p^d}$, $d = \mathrm{lcm}(d_1,d_2,\dots,d_r)$. Since $\mathrm{Gal}(\mathbb{F}_{p^d}/\mathbb{F}_p)$ is cyclic of order $d$ by
Theorem 11.1.7, part (a) follows.

For later purposes, let us describe the action of $\mathrm{Gal}(\mathbb{F}_{p^d}/\mathbb{F}_p)$ on the roots of $\bar f$.
By Theorem 11.1.7, the Galois group is generated by the Frobenius automorphism
$\alpha \mapsto \alpha^p$. Since $\bar f_i$ is irreducible, Exercise 7 of Section 11.1 implies that if $\alpha$ is a root
of $\bar f_i$, then all roots are given by

$$\alpha,\ \alpha^p,\ \alpha^{p^2},\ \dots,\ \alpha^{p^{d_i-1}}, \quad d_i = \deg(\bar f_i).$$

Hence the action of $\alpha \mapsto \alpha^p$ on the roots of $\bar f_i$ is given by a $d_i$-cycle. Since $\bar f$ is
separable, it follows that $\alpha \mapsto \alpha^p$ acts on the roots of $\bar f$ according to a product of
disjoint cycles of lengths $d_1,\dots,d_r$, just as in part (b) of the theorem.

We turn to part (b). Consider the universal version of $s_u(y)$ defined by

$$S_u(y) = \prod_{\sigma \in S_n} \bigl(y - (u_1x_{\sigma(1)} + \cdots + u_nx_{\sigma(n)})\bigr) \in \mathbb{Z}[x_1,\dots,x_n,u_1,\dots,u_n,y].$$

This is symmetric in $x_1,\dots,x_n$, so that

$$S_u(y) \in \mathbb{Z}[\sigma_1,\dots,\sigma_n,u_1,\dots,u_n,y]$$

by the Fundamental Theorem of Symmetric Polynomials over $\mathbb{Z}$ (see Exercise 6 of
Section 9.1).
Write the polynomials $f$ and $\bar f$ as

$$\begin{aligned}
f &= x^n - c_1x^{n-1} + \cdots + (-1)^nc_n \in \mathbb{Z}[x], \\
\bar f &= x^n - \bar c_1x^{n-1} + \cdots + (-1)^n\bar c_n \in \mathbb{F}_p[x].
\end{aligned}$$

This gives

$$\begin{aligned}
s_u(y) &\in \mathbb{Z}[u_1,\dots,u_n,y] \text{ obtained from } S_u(y) \text{ via } \sigma_i \mapsto c_i, \\
\bar s_u(y) &\in \mathbb{F}_p[u_1,\dots,u_n,y] \text{ obtained from } S_u(y) \text{ via } \sigma_i \mapsto \bar c_i.
\end{aligned}$$

Thus $\bar s_u(y)$ is the reduction of $s_u(y)$ modulo $p$, since $\bar c_i$ is the reduction of $c_i$ modulo $p$.
We relate $s_u(y)$ and $\bar s_u(y)$ to the Galois groups of $f$ and $\bar f$ as follows. As usual, the
Galois group of $f$ over $\mathbb{Q}$ maps to a subgroup $G_f \subset S_n$ that records the Galois action
on the roots. Given an irreducible factor $h$ of $s_u(y)$ over $\mathbb{Q}$, Theorem 13.4.2 implies
that $G_f$ is conjugate to

$$G = \{\sigma \in S_n \mid \sigma\cdot h = h\}.$$

By Exercise 6 we may assume that $h$ is an irreducible factor of $s_u(y)$ in the ring
$\mathbb{Z}[u_1,\dots,u_n,y]$. Reducing this modulo $p$ gives $\bar h \in \mathbb{F}_p[u_1,\dots,u_n,y]$. If $g$ is an irre-
ducible factor of $\bar h$, then it is also an irreducible factor of $\bar s_u(y)$, so that by Theo-
rem 13.4.2, the Galois group of $\bar f$ over $\mathbb{F}_p$ gives a subgroup of $S_n$ conjugate to

$$\bar G = \{\sigma \in S_n \mid \sigma\cdot g = g\}.$$

We claim that $\bar G \subset G$. To prove this, suppose that $\sigma\cdot g = g$ but $\sigma\cdot h = h_1 \ne h$.
Then $\sigma\cdot s_u(y) = s_u(y)$ implies that $h_1$ is also an irreducible factor of $s_u(y)$. Since
$\mathbb{Z}[u_1,\dots,u_n,y]$ is a UFD, we must have

$$s_u(y) = h\,h_1\,q$$

for some polynomial $q \in \mathbb{Z}[u_1,\dots,u_n,y]$. Reducing this modulo $p$ gives

$$\bar s_u(y) = \bar h\,\bar h_1\,\bar q \in \mathbb{F}_p[u_1,\dots,u_n,y].$$

Furthermore, the $S_n$-action is compatible with reduction modulo $p$, so that $h_1 = \sigma\cdot h$
implies $\bar h_1 = \sigma\cdot\bar h$. Since $g$ divides $\bar h$, we see that $g = \sigma\cdot g$ divides $\sigma\cdot\bar h = \bar h_1$. By the
above equation, this implies that $g^2$ divides $\bar s_u(y)$. Yet over the splitting field $\bar L$ of
$\bar f$, (13.41) implies that $\bar s_u(y)$ is a product of distinct irreducible factors. This easily
implies that the same is true over $\mathbb{F}_p$. Hence we have a contradiction, which proves
that $\sigma\cdot h = h$ whenever $\sigma\cdot g = g$. Thus $\bar G \subset G$.

By part (a) the Galois group of $\bar f$ over $\mathbb{F}_p$ contains an element whose action on
the roots of $\bar f$ is given by a product of disjoint cycles of lengths $d_1,\dots,d_r$. Since the
conjugate of a product of disjoint cycles of lengths $d_1,\dots,d_r$ is a permutation of the
same form, we see that $\bar G$ and hence $G$ contain a permutation of the desired form.
Since $G$ is conjugate to $G_f$, the Galois group of $f$ must contain an automorphism
whose action on the roots is as described in part (b) of the theorem. ■


Here is an example to illustrate part (b) of Theorem 13.4.5. 


Example 13.4.6 Consider $f = x^5 + 20x + 16 \in \mathbb{Q}[x]$. In Exercise 7 you will verify
that $f$ is irreducible with discriminant $\Delta(f) = 2^{16}5^6$. This shows that the Galois
group of $f$ over $\mathbb{Q}$ is isomorphic to a subgroup of $A_5$.

Working modulo 7, we have the irreducible factorization

$$\bar f = (x+2)(x+3)(x^3 + 2x^2 + 5x + 5) \quad \text{in } \mathbb{F}_7[x].$$

Since $7 \nmid \Delta(f)$, part (b) of Theorem 13.4.5 implies that the order of the Galois group
is divisible by 3. The classification of transitive subgroups of $S_5$ given in (13.16)
shows that $A_5$ has no proper transitive subgroup of order divisible by 3. Hence the
Galois group of $f$ over $\mathbb{Q}$ is isomorphic to $A_5$. ◁▷


Our next example uses the cycle decomposition of part (b) of Theorem 13.4.5. 


Example 13.4.7 In Section 6.4 we showed that $f = x^5 - 6x + 3 \in \mathbb{Q}[x]$ has $S_5$ as its
Galois group over $\mathbb{Q}$. If you look carefully at the argument given in Section 6.4,
you'll see that we showed first that the image of the Galois group in $S_5$ contains a
5-cycle and a 2-cycle, and second that any 5-cycle and 2-cycle generate $S_5$.

Using part (b) of Theorem 13.4.5, it is easy to get the required cycles. Consider
the irreducible factorizations

$$\begin{aligned}
\bar f &= x^5 + 4x + 3 \quad \text{in } \mathbb{F}_5[x], \\
\bar f &= (x+2)(x+7)(x+13)(x^2 + 12x + 13) \quad \text{in } \mathbb{F}_{17}[x].
\end{aligned}$$

The first gives a 5-cycle and the second gives a 2-cycle. The theorem applies to these
primes, since $\Delta(f) = -1737531 = -3^4\cdot 19\cdot 1129$.

In Exercise 8 you will give a different proof that the Galois group is $S_5$ by reducing
modulo 11 and using the method of Example 13.4.6. ◁▷
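
The degree patterns used in Examples 13.4.6 and 13.4.7 can be found without a computer algebra system. The following Python sketch (Python 3.8 or later, standard library only) uses distinct-degree factorization, a standard technique that is not taken from the text: for $d = 1, 2, \dots$ it computes $\gcd(\bar f, x^{p^d} - x)$, which is the product of the irreducible factors of degree $d$ once the factors of smaller degree have been removed. All function names are invented for this illustration, polynomials are coefficient lists with the constant term first, and the input is assumed to be squarefree modulo $p$ (that is, $p \nmid \Delta(f)$).

```python
def trim(a):
    """Remove leading zero coefficients in place (constant term comes first)."""
    while a and a[-1] == 0:
        a.pop()
    return a


def poly_divmod(a, b, p):
    """Quotient and remainder of a by b in F_p[x]; b must be nonzero mod p."""
    a = trim([c % p for c in a])
    b = trim([c % p for c in b])
    q = [0] * max(len(a) - len(b) + 1, 1)
    inv_lead = pow(b[-1], -1, p)  # inverse of the leading coefficient
    while len(a) >= len(b):
        shift = len(a) - len(b)
        coef = a[-1] * inv_lead % p
        q[shift] = coef
        for i, c in enumerate(b):
            a[shift + i] = (a[shift + i] - coef * c) % p
        trim(a)
    return trim(q), a


def poly_mulmod(a, b, f, p):
    """Product a*b reduced modulo f in F_p[x]."""
    if not a or not b:
        return []
    prod = [0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            prod[i + j] = (prod[i + j] + x * y) % p
    return poly_divmod(prod, f, p)[1]


def poly_gcd(a, b, p):
    """A greatest common divisor in F_p[x] (not normalized to be monic)."""
    a = trim([c % p for c in a])
    b = trim([c % p for c in b])
    while b:
        a, b = b, poly_divmod(a, b, p)[1]
    return a


def poly_pow_p(h, f, p):
    """h^p modulo f, by square-and-multiply."""
    result, base, e = [1], h, p
    while e:
        if e & 1:
            result = poly_mulmod(result, base, f, p)
        base = poly_mulmod(base, base, f, p)
        e >>= 1
    return result


def degree_pattern(coeffs, p):
    """Degrees of the irreducible factors of a squarefree polynomial mod p."""
    f = trim([c % p for c in coeffs])
    degrees, d = [], 0
    h = poly_divmod([0, 1], f, p)[1]      # x mod f
    while len(f) - 1 >= 2 * (d + 1):
        d += 1
        h = poly_pow_p(h, f, p)           # now h = x^(p^d) mod f
        diff = h + [0] * (2 - len(h))     # copy of h, padded to length >= 2
        diff[1] = (diff[1] - 1) % p       # diff = h - x
        g = poly_gcd(f, diff, p)          # product of the factors of degree d
        if len(g) > 1:
            degrees += [d] * ((len(g) - 1) // d)
            f = poly_divmod(f, g, p)[0]
            h = poly_divmod(h, f, p)[1]
    if len(f) > 1:                        # what is left is irreducible
        degrees.append(len(f) - 1)
    return degrees


print(degree_pattern([16, 20, 0, 0, 0, 1], 7))   # x^5 + 20x + 16 modulo 7
print(degree_pattern([3, -6, 0, 0, 0, 1], 5))    # x^5 - 6x + 3 modulo 5
print(degree_pattern([3, -6, 0, 0, 0, 1], 17))   # x^5 - 6x + 3 modulo 17
```

By part (b) of Theorem 13.4.5, each list returned by `degree_pattern` is the cycle type of some element of the Galois group over $\mathbb{Q}$, provided the prime does not divide the discriminant.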


The paper [26] discusses an approach to computing Galois groups that uses The- 
orem 13.4.5 more systematically. 


Mathematical Notes 


We will discuss three topics related to Theorem 13.4.5.

■ Reduction Modulo p. Given a monic polynomial $f \in \mathbb{Z}[x]$, Theorem 13.4.5 shows
that its factorization modulo $p$ gives interesting information about the Galois group
of $f$ over $\mathbb{Q}$. The reduction is interesting for other reasons connected to what is
known as class field theory. This is a large topic, so we will confine ourselves to two
examples.

The first concerns the case when $f$ is the quadratic polynomial $f = x^2 - a$, where
$a \in \mathbb{Z}$. Since $\Delta(f) = 4a$, we know that $\bar f \in \mathbb{F}_p[x]$ is separable when $p \nmid 4a$. For such a
prime $p$, $\bar f$ splits completely modulo $p$ if and only if the congruence

$$x^2 \equiv a \bmod p$$

has an integer solution. When the latter holds, we say that $a$ is a quadratic residue
modulo $p$. Quadratic residues play an important role in number theory and are related
to the Legendre symbol defined in the Mathematical Notes to Section 9.2.

A deeper example is the following observation of Kronecker. The polynomial

$$f = (x^3 - 10x)^2 + 31(x^2 - 1)^2 \in \mathbb{Z}[x]$$

has discriminant $\Delta(f) = -2^6\cdot 3^8\cdot 31^3$. Now consider the following question: For
which primes $p$ does the factorization of $f$ modulo $p$ include a linear factor? In other
words, when does $f$ have a root modulo $p$? The amazing answer, due to Kronecker,
is that if $p > 3$ is a prime different from 31, then

$$f(x) \equiv 0 \bmod p \text{ for some } x \in \mathbb{Z} \iff p = x^2 + 31y^2 \text{ for some } x, y \in \mathbb{Z}.$$

So our question characterizes primes of the form $x^2 + 31y^2$. Kronecker never pub-
lished a proof of this result, which today is regarded as part of class field theory and
complex multiplication. See [8] for an introduction to this rich subject.
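
Kronecker's criterion can be tested numerically for small primes. The following Python sketch checks, for every prime $5 \le p < 200$ other than 31, whether $f$ has a root modulo $p$ and whether $p$ can be written as $x^2 + 31y^2$, and records any prime where the two answers disagree. It is a brute-force experiment under the stated bound, not a proof, and the helper names are invented here.

```python
from math import isqrt


def is_prime(n):
    """Trial division; adequate for the small range used here."""
    if n < 2:
        return False
    return all(n % d for d in range(2, isqrt(n) + 1))


def f(x):
    """Kronecker's polynomial (x^3 - 10x)^2 + 31(x^2 - 1)^2."""
    return (x**3 - 10*x)**2 + 31*(x**2 - 1)**2


def has_root_mod(p):
    """Does f have a root modulo p?"""
    return any(f(x) % p == 0 for x in range(p))


def is_of_form(p):
    """Is p = x^2 + 31*y^2 for some integers x, y?"""
    y = 0
    while 31 * y * y <= p:
        rest = p - 31 * y * y
        if isqrt(rest) ** 2 == rest:
            return True
        y += 1
    return False


primes = [p for p in range(5, 200) if is_prime(p) and p != 31]
represented = [p for p in primes if is_of_form(p)]
mismatches = [p for p in primes if has_root_mod(p) != is_of_form(p)]

print(represented)   # primes below 200 of the form x^2 + 31y^2
print(mismatches)    # primes where the two conditions disagree
```

For instance, $47 = 4^2 + 31\cdot 1^2$ and $f(2) = 423 = 9\cdot 47$, so both conditions hold for $p = 47$.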


■ The Chebotarev Density Theorem. Let $f \in \mathbb{Z}[x]$ be monic and separable of
degree $n$ with splitting field $\mathbb{Q} \subset L$. Given a prime $p$ that does not divide $\Delta(f)$,
Theorem 13.4.5 implies that $\mathrm{Gal}(L/\mathbb{Q})$ contains an element that corresponds to the
Frobenius automorphism $\alpha \mapsto \alpha^p$ in the Galois group over $\mathbb{F}_p$. This element of
$\mathrm{Gal}(L/\mathbb{Q})$ is called the Artin symbol of $p$, denoted $\sigma_p$. Since the proof of Theo-
rem 13.4.5 involves choices related to the ordering of the roots, the Artin symbol $\sigma_p$
is well defined only up to conjugacy in $\mathrm{Gal}(L/\mathbb{Q})$. The Chebotarev Density Theorem
describes the behavior of $\sigma_p$ as we vary the prime $p$:

• Up to conjugacy, every element of $\mathrm{Gal}(L/\mathbb{Q})$ equals $\sigma_p$ for some prime $p$.
• If we fix a conjugacy class $C$ of $\mathrm{Gal}(L/\mathbb{Q})$, then the percentage of primes $p$ whose
Artin symbols $\sigma_p$ lie in $C$ is proportional to $|C|$.

In the second bullet, the "percentage of primes" needs to be defined carefully. This
and the Chebotarev Density Theorem are discussed in §8.B of [8]. See also [37].
We can reformulate this in terms of $\mathrm{Gal}(L/\mathbb{Q}) \simeq G \subset S_n$ as follows. A permutation
$\tau \in S_n$ has cycle type $d_1,\dots,d_r$, where $d_1 \le \cdots \le d_r$ and $d_1 + \cdots + d_r = n$, if $\tau$ is a
product of disjoint cycles (including 1-cycles) of lengths $d_1,\dots,d_r$. In Exercise 9
you will prove that two elements of $S_n$ are conjugate if and only if they have the same
cycle type. For a fixed cycle type $d_1,\dots,d_r$, the set

$$\{\sigma \in G \mid \sigma \text{ has cycle type } d_1,\dots,d_r\} \tag{13.44}$$

is a union of conjugacy classes in $G$ (see Exercise 10). Hence, if we fix the cycle
type $d_1,\dots,d_r$ of an element of $G$, then the Chebotarev Density Theorem implies the
following:

• There is some prime $p$ for which the irreducible factors of $f$ modulo $p$ have
degrees $d_1,\dots,d_r$.
• The percentage of primes for which the irreducible factors of $f$ modulo $p$ have
these degrees is proportional to the number of elements of $G$ with this cycle type.


Here is an example of what this looks like in practice. 


Example 13.4.8 Consider $f = x^4 - 7x^3 + 19x^2 - 23x + 11 \in \mathbb{Z}[x]$, which has $\Delta(f) =
5^3$. For the 200 primes $7 \le p \le 1237$, it is straightforward to compute the degrees
of the irreducible factors of $f$ modulo $p$ using Mathematica or Maple. When we
tabulate the resulting degree patterns and the percentage of primes corresponding to
each pattern, we obtain:

(13.45)
irreducible factors  | 4 of degree 1 | 2 of degree 2 | 1 of degree 4
percentage of primes | 25%           | 23%           | 52%

The last column shows that $f$ remains irreducible for some primes, so that the Galois
group of $f$ contains a 4-cycle by Theorem 13.4.5. In $S_4$, a 4-cycle $(abcd)$ generates
the subgroup

$$\langle (abcd) \rangle = \{e = (a)(b)(c)(d),\ (ac)(bd),\ (abcd),\ (dcba)\}.$$

For such a subgroup, the distribution of cycle types is:

(13.46)
cycle type             | 1,1,1,1 | 2,2 | 4
percentage of elements | 25%     | 25% | 50%

By the Chebotarev Density Theorem, the close match with (13.45) strongly suggests
that the Galois group of $f$ is cyclic of order 4. However, there could be a large prime
whose degree pattern is not in (13.45). Hence this is not a rigorous computation of
the Galois group. ◁▷
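
A table such as (13.46) can be produced mechanically for any subgroup of $S_n$ given by generators. The following Python sketch does this for the subgroup generated by a 4-cycle. The encoding is an assumption of this illustration: a permutation of $\{0,1,2,3\}$ is a tuple `perm` with `perm[i]` the image of `i`, and the names `generate` and `cycle_type` are invented here.

```python
from collections import Counter


def compose(s, t):
    """Composition (s o t)(i) = s(t(i)) of permutations stored as tuples."""
    return tuple(s[t[i]] for i in range(len(t)))


def generate(gens):
    """All elements of the finite group generated by gens."""
    identity = tuple(range(len(gens[0])))
    group, frontier = {identity}, [identity]
    while frontier:
        g = frontier.pop()
        for s in gens:
            h = compose(s, g)
            if h not in group:
                group.add(h)
                frontier.append(h)
    return group


def cycle_type(perm):
    """Sorted tuple of cycle lengths, including 1-cycles."""
    seen, lengths = set(), []
    for i in range(len(perm)):
        if i not in seen:
            length, j = 0, i
            while j not in seen:
                seen.add(j)
                j = perm[j]
                length += 1
            lengths.append(length)
    return tuple(sorted(lengths))


four_cycle = (1, 2, 3, 0)          # the 4-cycle 0 -> 1 -> 2 -> 3 -> 0
group = generate([four_cycle])
counts = Counter(cycle_type(g) for g in group)
percentages = {ct: 100 * k / len(group) for ct, k in counts.items()}

for ct in sorted(percentages):
    print(ct, percentages[ct])
```

Passing other generating sets to `generate` gives the corresponding tables for other subgroups of $S_4$, which can then be compared with observed degree patterns in the spirit of the example.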


The papers [16] and [26] discuss this method for computing Galois groups. 


■ Bad Reduction Modulo p. When $f \in \mathbb{Z}[x]$ is monic and separable, Theorem 13.4.5
gives information about the Galois group of $f$ by reducing $f$ modulo primes $p$ such
that $p \nmid \Delta(f)$. If instead $p \mid \Delta(f)$, then $\bar f$ is not separable and our arguments fail.
When this happens, more advanced methods using the decomposition group and
the inertia group can still provide useful information about the Galois group. For
example, in Section 6.4 we mentioned that $f = x^n - x - 1$ has Galois group $S_n$ over $\mathbb{Q}$
when $n \ge 2$. As explained on page 42 of reference [4] to Chapter 6, this is proved by
reducing modulo the primes dividing the discriminant.


Exercises for Section 13.4 


Exercise 1. Verify the computations given in Example 13.4.1.


Exercise 2. Prove that the polynomial $s_u(y)$ defined in (13.41) lies in $F[u_1,\dots,u_n,y]$.


Exercise 3. This exercise is concerned with the proof of Theorem 13.4.2.
(a) Let $\beta_1,\dots,\beta_n \in L$. Prove that $y + \sum_{i=1}^n \beta_iu_i$ is irreducible in $L[u_1,\dots,u_n,y]$. (This implies
that (13.41) is the irreducible factorization of $s_u(y)$ in $L[u_1,\dots,u_n,y]$.)
(b) Let $g, h \in F[u_1,\dots,u_n,y]$, and assume that in the larger ring $L[u_1,\dots,u_n,y]$ we have $h = gq$
for some $q \in L[u_1,\dots,u_n,y]$. Prove that $q \in F[u_1,\dots,u_n,y]$.
(c) In the final part of the proof of Theorem 13.4.2, we showed that $G \subset \sigma^{-1}G_f\sigma$. Prove the
opposite inclusion.


Exercise 4. Consider the polynomial $s_u(y)$ when $f = x^3 + x^2 - 2x - 1$ from Example 13.4.3.
(a) Compute $s_u(y) \in \mathbb{Q}[u_1,u_2,u_3,y]$, and derive the factorization given in Example 13.4.3.
(b) Let $h$ be the first factor of $s_u(y)$ given in Example 13.4.3, multiplied by $-1$ so that it is
monic in $y$. Using SymmetricReduction in Mathematica or NormalForm in Maple as
in Section 2.3, write $h$ as a polynomial in $y$ so that its coefficients are of the form

a symmetric polynomial in $u_1,u_2,u_3$ + a remainder in $u_1,u_2,u_3$.

This should give the formula for $h$ given in Example 13.4.3.


Exercise 5. Use the method of part (a) of Exercise 4 to derive the factorization of $s_u(y)$ given
in Example 13.4.4.


Exercise 6. As in the proof of Theorem 13.4.5, suppose that we have $s_u(y) \in \mathbb{Z}[u_1,\dots,u_n,y]$
and $h \in \mathbb{Q}[u_1,\dots,u_n,y]$ is an irreducible factor of $s_u(y)$ when $s_u(y)$ is regarded as an element
of $\mathbb{Q}[u_1,\dots,u_n,y]$. In this exercise we will study how close $h$ is to being an irreducible factor
of $s_u(y)$ in $\mathbb{Z}[u_1,\dots,u_n,y]$.
(a) We know that the rings $\mathbb{Z}[x_1,\dots,x_n]$ and $\mathbb{Q}[x_1,\dots,x_n]$ are both UFDs. Prove that if
$f \in \mathbb{Z}[x_1,\dots,x_n]$ is irreducible and nonconstant, then it is also irreducible when regarded
as an element of $\mathbb{Q}[x_1,\dots,x_n]$.
(b) Prove that if $s_u(y)$ and $h$ are as above, then $h$ is a $\mathbb{Q}$-multiple of an irreducible factor of
$s_u(y)$ in $\mathbb{Z}[u_1,\dots,u_n,y]$.


Exercise 7. Let $f = x^5 + 20x + 16 \in \mathbb{Q}[x]$ be the polynomial of Example 13.4.6. Show that $f$
is irreducible over $\mathbb{Q}$, and compute its discriminant and irreducible factorization modulo 7.


Exercise 8. Compute the Galois group of $f = x^5 - 6x + 3$ over $\mathbb{Q}$ using reduction modulo 11
and the method of Example 13.4.6.


Exercise 9. Prove that two permutations in $S_n$ are conjugate if and only if they have the same
cycle type.


Exercise 10. Let $G$ be a subgroup of $S_n$. For a fixed cycle type $d_1,\dots,d_r$, consider the set
(13.44) of all elements of $G$ with this cycle type.
(a) Prove that this set is either empty or a union of conjugacy classes of $G$.
(b) Give an example where the set is empty, and give another example where it is a union of
two conjugacy classes of $G$.


Exercise 11. This exercise will explore the ideas introduced in Example 13.4.8.
(a) For each transitive subgroup of $S_4$, make a table similar to (13.46) that lists the number
of elements of each possible cycle type for that subgroup.
(b) For each polynomial in Exercise 14 of Section 13.1, compute its factorization modulo
200 primes, and record your results in a table similar to (13.45). Use this to guess the
Galois group of each polynomial.


CHAPTER 14


SOLVABLE PERMUTATION GROUPS 


This chapter will study solvability by radicals for irreducible polynomials of degree
$p$ or $p^2$, where $p$ is prime. These results go back to Galois and illustrate his amazing
insight into group theory. We will also discover why Galois invented finite fields.

While Galois's result for degree $p$ is relatively easy to prove, understanding the
case of degree $p^2$ requires the theory of permutation groups (subgroups of $S_n$).
We will see that in degree $p^2$, irreducible polynomials can be either primitive or
imprimitive. The case of solvable imprimitive subgroups of $S_{p^2}$ will be considered
first, followed by the more complicated case of solvable primitive subgroups. The
proofs will involve surprising amounts of group theory.


14.1 POLYNOMIALS OF PRIME DEGREE 


The goal of this section is to prove the following wonderful theorem of Galois. 


Theorem 14.1.1 Let $F$ be a field of characteristic 0, and let $f \in F[x]$ be irreducible
of prime degree $p$. Then the following are equivalent:
(a) $f$ is solvable by radicals over $F$.
(b) For every pair of roots $\alpha \ne \beta$ of $f$, $F(\alpha,\beta)$ is the splitting field of $f$ over $F$.
(c) For some pair of roots $\alpha \ne \beta$ of $f$, $F(\alpha,\beta)$ is the splitting field of $f$ over $F$.
(d) The Galois group of $f$ over $F$ is isomorphic to a subgroup of $\mathrm{AGL}(1,\mathbb{F}_p)$.


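The proof will be given later in the section. Recall from Section 6.4 that the group
$\mathrm{AGL}(1,\mathbb{F}_p)$ consists of all functions

$$\gamma_{a,b}(u) = au + b, \quad a \in \mathbb{F}_p^*,\ b \in \mathbb{F}_p.$$

If we identify the congruence classes $[1],[2],\dots,[p] \in \mathbb{F}_p$ with the numbers $1,2,\dots,p$,
then $\mathrm{AGL}(1,\mathbb{F}_p)$ becomes a subgroup of $S_p$. Thus

$$\mathrm{AGL}(1,\mathbb{F}_p) \subset S_p$$

is a subgroup of order $p(p-1)$, and an element $\gamma_{a,b} \in \mathrm{AGL}(1,\mathbb{F}_p)$ is the permutation

$$\gamma_{a,b} = \begin{pmatrix} 1 & 2 & \cdots & p \\ a+b & 2a+b & \cdots & pa+b \end{pmatrix} = \begin{pmatrix} i \\ ai+b \end{pmatrix},$$

where we interpret everything modulo $p$. In particular, let $\theta = \gamma_{1,1}$. Then we have
the $p$-cycle

$$\theta = \begin{pmatrix} i \\ i+1 \end{pmatrix} = (1\,2\,\dots\,p) \in \mathrm{AGL}(1,\mathbb{F}_p).$$

Here are two useful facts about $\theta$ and $\mathrm{AGL}(1,\mathbb{F}_p)$.


Lemma 14.1.2
(a) $\mathrm{AGL}(1,\mathbb{F}_p)$ is the normalizer of $\langle\theta\rangle$ in $S_p$.
(b) If $\tau \in S_p$ satisfies $\tau\theta\tau^{-1} \in \mathrm{AGL}(1,\mathbb{F}_p)$, then $\tau \in \mathrm{AGL}(1,\mathbb{F}_p)$.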


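Proof: The normalizer of $\langle\theta\rangle$ in $S_p$ consists of all $\tau \in S_p$ such that $\tau\langle\theta\rangle\tau^{-1} = \langle\theta\rangle$.
In Exercise 1 you will show that $\tau$ lies in the normalizer if and only if

$$\tau\theta = \theta^{\ell}\tau \quad \text{for some } 1 \le \ell \le p-1.$$

Since $\theta^{\ell}(i) = i + \ell$, the above equation is equivalent to the identity

$$\tau(i+1) = \tau(i) + \ell, \quad i = 1,\dots,p, \tag{14.1}$$

where as usual we interpret everything modulo $p$. This implies that

$$\tau(i+2) = \tau((i+1)+1) = \tau(i+1) + \ell = \tau(i) + \ell + \ell = \tau(i) + 2\ell,$$

and more generally, one easily proves that for any positive $j$,

$$\tau(i+j) = \tau(i) + j\ell, \quad i = 1,\dots,p$$

(see Exercise 1). Then setting $i = p$ gives

$$\tau(j) = \tau(p+j) = \tau(p) + j\ell = \gamma_{\ell,\tau(p)}(j).$$

This shows that $\tau = \gamma_{\ell,\tau(p)} \in \mathrm{AGL}(1,\mathbb{F}_p)$. Conversely, it is easy to see that any
$\gamma_{a,b} \in \mathrm{AGL}(1,\mathbb{F}_p)$ satisfies (14.1) with $\ell = a$. This proves part (a) of the lemma.

For part (b), first observe that $\langle\theta\rangle$ is a $p$-Sylow subgroup of $\mathrm{AGL}(1,\mathbb{F}_p)$, since
$|\mathrm{AGL}(1,\mathbb{F}_p)| = p(p-1)$. Furthermore, it is unique by the Second Sylow Theorem
(see Theorem A.5.1), since $\langle\theta\rangle$ is normal in $\mathrm{AGL}(1,\mathbb{F}_p)$ by part (a).

Now assume that $\tau \in S_p$ and $\tau\theta\tau^{-1} \in \mathrm{AGL}(1,\mathbb{F}_p)$. Then $\langle\tau\theta\tau^{-1}\rangle$ is also a $p$-
Sylow subgroup of $\mathrm{AGL}(1,\mathbb{F}_p)$. By uniqueness, $\langle\theta\rangle = \langle\tau\theta\tau^{-1}\rangle = \tau\langle\theta\rangle\tau^{-1}$. Thus $\tau$
normalizes $\langle\theta\rangle$ and hence lies in $\mathrm{AGL}(1,\mathbb{F}_p)$ by part (a). ■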
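
Both parts of Lemma 14.1.2 can be confirmed by brute force for a small prime. The following Python sketch does this for $p = 5$, using the residues $0,\dots,4$ in place of the labels $1,\dots,5$ (so 0 plays the role of $p$). The tuple encoding of permutations and the helper names are assumptions made for this illustration.

```python
from itertools import permutations

p = 5


def compose(s, t):
    """Composition (s o t)(i) = s(t(i)) of permutations stored as tuples."""
    return tuple(s[t[i]] for i in range(len(t)))


def inverse(s):
    """Inverse permutation."""
    inv = [0] * len(s)
    for i, v in enumerate(s):
        inv[v] = i
    return tuple(inv)


# AGL(1, F_p): all maps i -> a*i + b with a nonzero, as permutations of 0..p-1.
agl = {tuple((a * i + b) % p for i in range(p))
       for a in range(1, p) for b in range(p)}

# theta is the p-cycle i -> i + 1; collect the cyclic subgroup it generates.
theta = tuple((i + 1) % p for i in range(p))
cyclic, g = set(), tuple(range(p))
for _ in range(p):
    cyclic.add(g)
    g = compose(theta, g)

# Part (a): all tau in S_p with tau <theta> tau^{-1} = <theta>.
normalizer = {tau for tau in permutations(range(p))
              if {compose(compose(tau, s), inverse(tau)) for s in cyclic} == cyclic}

# Part (b): all tau in S_p with tau theta tau^{-1} in AGL(1, F_p).
conjugators = {tau for tau in permutations(range(p))
               if compose(compose(tau, theta), inverse(tau)) in agl}

print(len(agl), normalizer == agl, conjugators == agl)
```

The search runs over all $5! = 120$ permutations, so it is instantaneous here; for larger primes the factorial growth makes the lemma itself the practical tool.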


We will use the following lemma several times. See Exercise 2 for a proof. 


Lemma 14.1.3 Suppose that $H$ is a normal subgroup of a finite group $G$ and let
$g \in G$. If the order of $g$ is relatively prime to $[G : H]$, then $g \in H$. ■


We can now characterize solvable transitive subgroups of S,. 


Proposition 14.1.4 Every solvable transitive subgroup G C S, is conjugate to a 
subgroup of AGL(1,F,) containing 0. 


Proof: Since G is transitive, the orbit of any i € {1,...,p} isall of {1,...,p}. Thus 
p divides the order of G by the Fundamental Theorem of Group Actions. Hence 
G contains an element of order p, by Cauchy’s Theorem. This element must be a 
p-cycle, since we are in S, (the order of a permutation is the least common multiple 
of its cycle lengths). By Exercise 9 of Section 13.4, any p-cycle in S, is conjugate 
to @. Replacing G with a suitable conjugate, we may assume that 6 € G. 

Since G is solvable, we can find subgroups 


{e}=GoCG,C---CG,1CG,=G 


such that Ge_, is normal in Ge and [Ge : Ge_,| is prime for @ = 1,...,n. 

Let $i$ be the smallest index such that $\theta \in G_i$. Note that $i > 0$.

We first claim that $[G_i : G_{i-1}] = p$. To see why, suppose that $[G_i : G_{i-1}] = q$, where
$q \ne p$ is prime. Since $\theta \in G_i$ has order $p$, Lemma 14.1.3 implies that $\theta \in G_{i-1}$, which
contradicts the definition of $i$. Hence $[G_i : G_{i-1}] = p$.

We next claim that $i = 1$. If $i > 1$, then $G_{i-1} \ne \{e\}$, so there is $\tau \in G_{i-1}$ such that
$\tau(j) = k$ for some $j \not\equiv k \bmod p$. Then $\theta^{j-k}$ maps $k$ to $j$, so that $\rho = \tau\theta^{j-k} \in G_i$ fixes $k$.
Thus $\rho$ is a product of disjoint cycles of lengths $< p$. Since $p$ is prime, it follows that the
order of $\rho$ is relatively prime to $p = [G_i : G_{i-1}]$. Hence $\rho \in G_{i-1}$ by Lemma 14.1.3,
and then $\theta^{j-k} \in G_{i-1}$ follows from $\tau \in G_{i-1}$. Since $j \not\equiv k \bmod p$, this implies that
$\theta \in G_{i-1}$, which contradicts the definition of $i$. Hence $i = 1$.

Since $G_0 = \{e\}$, these claims imply that $G_1$ has order $p$ and contains $\theta$. Since $\theta$
has order $p$, we conclude that $G_1 = \langle\theta\rangle$. It follows that $G_1 \subset \mathrm{AGL}(1,\mathbb{F}_p)$.

Now let $1 \le j \le n$ be the largest index such that $G_j \subset \mathrm{AGL}(1,\mathbb{F}_p)$. Suppose that
$j < n$, and take $\tau \in G_{j+1}$. Then $\theta \in G_1 \subset G_j$ implies that $\tau\theta\tau^{-1} \in G_j$, since $G_j$ is
normal in $G_{j+1}$. This gives $\tau\theta\tau^{-1} \in \mathrm{AGL}(1,\mathbb{F}_p)$, so that $\tau \in \mathrm{AGL}(1,\mathbb{F}_p)$ by part (b) of
Lemma 14.1.2. Since $\tau \in G_{j+1}$ was arbitrary, we conclude that $G_{j+1} \subset \mathrm{AGL}(1,\mathbb{F}_p)$.
This contradicts the definition of $j$. Hence we must have $j = n$, which gives the
desired inclusion $G = G_n \subset \mathrm{AGL}(1,\mathbb{F}_p)$. ∎

We now have the tools needed to prove Galois’s theorem.


Proof of Theorem 14.1.1: Let $f$ have roots $\alpha_1,\dots,\alpha_p$ in a splitting field $L$. Then
$\mathrm{Gal}(L/F)$ is isomorphic to $G \subset S_p$, where $\sigma \in \mathrm{Gal}(L/F)$ maps to $\tau \in G$ such that
$\sigma(\alpha_i) = \alpha_{\tau(i)}$. By Proposition 6.3.7, $G$ is transitive, since $f$ is irreducible.

First consider (a) $\Leftrightarrow$ (d). If $f$ is solvable by radicals over $F$, then $G$ is transitive (by
the above) and solvable (by Theorem 8.5.3). Using Proposition 14.1.4, we conclude
that $G$ is conjugate to a subgroup of $\mathrm{AGL}(1,\mathbb{F}_p)$. This proves (a) $\Rightarrow$ (d). For the
converse, note that $\mathrm{AGL}(1,\mathbb{F}_p)$ is solvable by Example 8.1.6, and then any subgroup
is also solvable by Proposition 8.1.3. Thus the Galois group of $f$ over $F$ is solvable,
so that $f$ is solvable by radicals over $F$ by Theorem 8.5.3.

We next prove (b) $\Rightarrow$ (c) $\Rightarrow$ (d) $\Rightarrow$ (b). The first implication is obvious. For the
second, observe that in the proof of Proposition 14.1.4, the first paragraph applies
to any transitive subgroup of $S_p$. Thus we may assume that $\langle\theta\rangle \subset G$. Now suppose
that $L = F(\alpha_i,\alpha_j)$ is the splitting field of $f$ over $F$ for some $i \ne j$. This gives the
extensions

$$F \subset F(\alpha_i) \subset F(\alpha_i,\alpha_j) = L,$$

where the first has degree $p$, since $f$ is irreducible over $F$, and the second has degree
at most $p - 1$, since $\alpha_j$ is a root of $f/(x - \alpha_i) \in F(\alpha_i)[x]$. By the Tower Theorem,

$$|G| = |\mathrm{Gal}(L/F)| = [L:F] = pm, \quad 1 \le m \le p-1. \tag{14.2}$$

Since $p$ is prime, it follows that $\langle\theta\rangle$ is a $p$-Sylow subgroup of $G$. According to the
Third Sylow Theorem (see Theorem A.5.1), the number of $p$-Sylow subgroups of $G$
divides $|G|$ and is congruent to 1 modulo $p$. In Exercise 3 you will use this and (14.2)
to show that $\langle\theta\rangle$ is the unique $p$-Sylow subgroup of $G$ and hence is normal in $G$. It
follows that $G$ is contained in the normalizer of $\langle\theta\rangle$ in $S_p$. By Lemma 14.1.2, the
normalizer is $\mathrm{AGL}(1,\mathbb{F}_p)$, so that $G \subset \mathrm{AGL}(1,\mathbb{F}_p)$. Thus $\mathrm{Gal}(L/F)$ is isomorphic to
a subgroup of $\mathrm{AGL}(1,\mathbb{F}_p)$.

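For (d) $\Rightarrow$ (b), relabel the roots of $f$ so that $G \subset \mathrm{AGL}(1,\mathbb{F}_p)$. We need to show that
$F(\alpha_i,\alpha_j)$ is the splitting field of $f$ over $F$ for any $i \ne j$. By the Galois correspondence,
it suffices to show that the only element $\sigma \in \mathrm{Gal}(L/F)$ fixing $F(\alpha_i,\alpha_j)$ is the identity.
Since $\sigma$ corresponds to $\tau \in G$ and $G \subset \mathrm{AGL}(1,\mathbb{F}_p)$, we see that $\tau = \gamma_{a,b}$ for some
$a \in \mathbb{F}_p^*$ and $b \in \mathbb{F}_p$. Then

$$\sigma(\alpha_i) = \alpha_i \;\Rightarrow\; \alpha_{\tau(i)} = \alpha_i \;\Rightarrow\; \alpha_{ai+b} = \alpha_i,$$

$$\sigma(\alpha_j) = \alpha_j \;\Rightarrow\; \alpha_{\tau(j)} = \alpha_j \;\Rightarrow\; \alpha_{aj+b} = \alpha_j.$$

This gives the equations $ai + b = i$ and $aj + b = j$, which modulo $p$ have the unique
solution $a = 1$, $b = 0$, since $i \not\equiv j$ modulo $p$. Thus $\tau = \gamma_{1,0}$ is the identity, so that $\sigma$ is
the identity in $\mathrm{Gal}(L/F)$. Hence $F(\alpha_i,\alpha_j)$ is the splitting field. ∎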
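
The last step of the proof says that a nonidentity element of $\mathrm{AGL}(1,\mathbb{F}_p)$ fixes at most one index. A minimal Python sketch of this fact, assuming the indices are the residues $0,\dots,p-1$ and $\gamma_{a,b}(u) = au + b$ modulo $p$ (the function names are hypothetical):

```python
def fixed_points(a, b, p):
    """Indices u in {0,...,p-1} with gamma_{a,b}(u) = a*u + b = u mod p."""
    return [u for u in range(p) if (a * u + b) % p == u]

def nonidentity_fixed_point_counts(p):
    """Set of fixed-point counts over all gamma_{a,b} except the identity gamma_{1,0}."""
    return {len(fixed_points(a, b, p))
            for a in range(1, p) for b in range(p)
            if (a, b) != (1, 0)}

if __name__ == "__main__":
    for p in (5, 7, 11):
        # Translations (a = 1, b != 0) fix nothing; maps with a != 1 fix one index.
        assert nonidentity_fixed_point_counts(p) == {0, 1}
```

Since no count exceeds 1, any $\gamma_{a,b}$ fixing two distinct indices must be the identity, which is what the argument above uses.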


We first encountered the affine linear group $\mathrm{AGL}(1,\mathbb{F}_p)$ as the Galois group of
$\mathbb{Q} \subset \mathbb{Q}(\zeta_p, \sqrt[p]{2})$ in Section 6.4. This extension is the splitting field of $x^p - 2 \in \mathbb{Q}[x]$,
which is obviously solvable by radicals. Hence what we did in Section 6.4 is a perfect
illustration of Theorem 14.1.1.
Mathematical Notes

The proof of Theorem 14.1.1 uses the following concept from group theory.


• Frobenius Groups. We showed above that $F(\alpha_i,\alpha_j)$ is the splitting field by arguing
that the identity is the only element of $\mathrm{AGL}(1,\mathbb{F}_p)$ that fixes $i$ and $j$. This generalizes
as follows. If a finite group $G$ acts transitively on a set $X$ such that $1 < |X| < |G|$,
and for every $x \ne y$ in $X$ the identity is the only element of $G$ fixing $x$ and $y$, then we
say that $G$ is a Frobenius group. When this happens, the isotropy subgroup $G_x$ of any
$x \in X$ is called a Frobenius complement. A discussion of Frobenius groups can be
found in [3, Sec. 3.4] and [14, p. 90]. See also Exercise 4.


Historical Notes 


Galois considered Theorem 14.1.1 to be one of the best applications of his theory.
His version of the theorem is as follows [Galois, p. 69]: 


PROPOSITION VIII


THEOREM. In order that an irreducible equation of prime degree be solv- 
able by radicals, it is necessary and sufficient that when any two of the roots are 
known, the others can be deduced from them rationally. 


If we are working over a field $F$, then “deduced from them rationally” means that
the other roots are rational functions with coefficients in $F$ in the known roots $\alpha, \beta$.
This implies that $F(\alpha,\beta)$ is the splitting field. Thus Galois’s theorem is (a) $\Leftrightarrow$ (b)
of Theorem 14.1.1. Galois especially liked this result because its statement doesn’t
mention Galois theory, yet the Galois group is crucial to the proof.

As for part (d) of Theorem 14.1.1, Galois says the following [Galois, p. 67]: 


Therefore, “if an irreducible equation of prime degree is solvable by radicals,
then the group of the equation contains only substitutions of the form

$$x_k, \quad x_{ak+b}$$

$a$ and $b$ being constants.”


Reciprocally, if this holds then I say that the equation will be solvable by 
radicals. 


This is (a) $\Leftrightarrow$ (d) of Theorem 14.1.1. Galois denotes the roots of the polynomial as
$x_0,\dots,x_{n-1}$, where $n$ is prime (his $n$ is our $p$). Furthermore, on page 65, Galois says
“We set in general $x_n = x_0$, $x_{n+1} = x_1, \dots$” Thus Galois treats the indices modulo $n$
just as we treat them modulo $p$.

Galois published his Proposition VIII separately in 1830, before he had worked 
out his general theory of solvability. It is possible that thinking about this case 
(and the group $\mathrm{AGL}(1,\mathbb{F}_p)$ in particular) led Galois to the general idea of normal
subgroups and solvability. The irony is that by focusing attention on the special case 
of Theorem 14.1.1, Galois distracted his contemporaries from the real depth of his 
innovations. 
Galois also formulated solvability by radicals in terms of the resolvents discussed
in Chapters 12 and 13. Let $f$ be irreducible of degree $p$. Since $\mathrm{AGL}(1,\mathbb{F}_p)$ has index
$(p-2)!$ in $S_p$, the theory of Section 13.3 constructs a resolvent $\Theta_f(y) \in F[y]$ such
that if the Galois group of $f$ is isomorphic to a subgroup of $\mathrm{AGL}(1,\mathbb{F}_p)$, then $\Theta_f(y)$
has a root in $F$, and the converse is true provided that the root is simple. Because of
this, Galois asserts that to check solvability by radicals,


it suffices to know whether or not this auxiliary equation [our $\Theta_f(y)$] of degree
$1.2.3\ldots(n-2)$ has a rational root.


(See [Galois, p. 69].) Here, Galois’s $n$ is our $p$, and “rational root” means root in $F$.
Although Galois (like Jordan) missed the importance of simple roots, this result
can be regarded as the generalization of Corollary 13.2.11 for an arbitrary prime.
Galois also knew how to build the resolvent $\Theta_f(y)$ using the methods discussed in
Section 12.1. For example, the polynomial (12.20) appears in Galois’s memoir.

In a letter written to Crelle in 1828, Abel stated a version of (a) $\Leftrightarrow$ (b) $\Leftrightarrow$ (c) from
Theorem 14.1.1 (see [Abel, Vol. II, p. 270]). Also, an earlier letter to Crelle gave 
formulas for the roots of a solvable polynomial of degree 5 and claimed that similar 
results apply in degrees 7, 11, 13, etc. Unfortunately, no details of his proofs are 
known. Kronecker fleshed out Abel’s ideas in an 1853 paper that is discussed in [5]. 


Exercises for Section 14.1 


Exercise 1. This exercise is concerned with the proof of part (a) of Lemma 14.1.2. Let
$\theta = (12\ldots p) \in S_p$.

(a) Prove that $\tau \in S_p$ lies in the normalizer of $\langle\theta\rangle$ if and only if $\tau\theta = \theta^\ell\tau$ for some
$1 \le \ell \le p-1$.

(b) Prove that (14.1) implies that $\tau(i + j) = \tau(i) + j\ell$ for all positive integers $j$.


Exercise 2. Let H be a normal subgroup of a finite group G and let g € G. The goal of this 
exercise is to prove Lemma 14.1.3. 

(a) Explain why $(gH)^{o(g)} = (gH)^{[G:H]} = H$ in the quotient group $G/H$.

(b) Now assume that $\gcd(o(g), [G:H]) = 1$. Prove that $g \in H$.


Exercise 3. Let G satisfy (14.2). Use (14.2) and the Third Sylow Theorem to prove that G has 
a unique p-Sylow subgroup H of order p. Then conclude that H is normal in G. 


Exercise 4. The definition of Frobenius group given in the Mathematical Notes involves a 
group G acting transitively on a set X. Prove that a group G is a Frobenius group if and only 
if $G$ has a subgroup $H$ such that $1 < |H| < |G|$ and $H \cap gHg^{-1} = \{e\}$ for all $g \notin H$.


Exercise 5. Let $F$ be a subfield of the real numbers, and let $f \in F[x]$ be irreducible of prime
degree p > 2. Assume that f is solvable by radicals. Prove that f has either a single real root 
or p real roots. This was proved by Kronecker in 1856 using methods due to Abel (see [15]). 


Exercise 6. By Example 8.5.5, $f = x^5 - 6x + 3$ is not solvable by radicals over $\mathbb{Q}$. Give a new
proof of this fact using the previous exercise together with the irreducibility of f and part (b) 
of Exercise 6 from Section 6.4. 


Exercise 7. Use Lemma 14.1.3 and part (a) of Lemma 14.1.2 to give a proof of part (b) of 
Lemma 14.1.2 that doesn’t use the Sylow Theorems. 
Exercise 8. Let $f \in F[x]$ be irreducible of prime degree $p \ge 5$, where $F$ has characteristic 0,
and let $\alpha \ne \beta$ be roots of $f$ in some splitting field. If $F(\alpha,\beta)$ contains all other roots of $f$,
then $f$ is solvable by radicals by Theorem 14.1.1. But suppose that there is some third root $\gamma$
such that $\gamma \in F(\alpha,\beta)$. Is this enough to force $f$ to be solvable by radicals?

(a) Use the classification of transitive subgroups of $S_5$ from Section 13.2 to show that the
answer is “yes” when $p = 5$.

(b) Use the polynomial $x^7 - 154x + 99$ from Example 13.3.10 to show that the answer is “no”
when $p = 7$.


14.2. IMPRIMITIVE POLYNOMIALS OF PRIME-SQUARED DEGREE 


Having studied polynomials of prime degree $p$, we turn our attention to polynomials
of degree $p^2$. In this section, we will see that such polynomials can be either primitive
or imprimitive. Our main result (Theorem 14.2.15) will describe the Galois group of
an irreducible imprimitive polynomial of degree $p^2$ that is solvable by radicals. The
primitive case will be considered in Sections 14.3 and 14.4.

The proof of Theorem 14.2.15 will require that we study permutation groups,
i.e., subgroups of $S_n$. After defining primitive and imprimitive permutation groups,
we will concentrate on the imprimitive case and use wreath products to classify all
solvable transitive imprimitive subgroups of $S_{p^2}$. Primitive permutation groups will
be considered in Section 14.3.


A. Primitive and Imprimitive Groups. By Section 6.3, the Galois group of a
separable polynomial $f$ of degree $n$ gives a permutation group $G \subset S_n$ that records the
Galois action on the roots of $f$. An important idea in Galois theory is that properties
of f should be reflected in the properties of G. For example, Proposition 6.3.7 says 
that f is irreducible over F if and only if G is transitive. Thus “transitive” is the 
permutation group analog of “irreducible” for polynomials. 

We next consider the concepts of imprimitive and primitive, which apply to both 
polynomials and permutation groups. We will begin with the former, where the idea 
is that separable polynomials come in two flavors, according to whether or not the 
roots break up into “blocks” under the action of the Galois group. 

Before giving the general definition, let us consider an example. 


Example 14.2.1 The polynomial $f = x^4 - 2 \in \mathbb{Q}[x]$ is separable with roots

$$\sqrt[4]{2},\ -\sqrt[4]{2} \quad\text{and}\quad i\sqrt[4]{2},\ -i\sqrt[4]{2}.$$

We have written the roots in two blocks that have the following nice property: If
we apply $\sigma \in \mathrm{Gal}(\mathbb{Q}(i,\sqrt[4]{2})/\mathbb{Q})$ to the first block of roots, then the result is either
the first block or the second block, and the same is true if we apply $\sigma$ to the second
block. This follows because $\sigma(-\alpha) = -\sigma(\alpha)$. Hence the action of the Galois group
respects the block structure of the roots. ◁▷


This leads to the following general definition. 
Definition 14.2.2 Let $f \in F[x]$ be a separable polynomial with splitting field $L$.
(a) $f$ is imprimitive if the set of roots of $f$ can be written as a disjoint union

$$R_1 \cup \cdots \cup R_k, \quad R_i \ne \emptyset \text{ for all } i,$$

such that for every $\sigma \in \mathrm{Gal}(L/F)$ and $1 \le i \le k$, we have $\sigma(R_i) = R_j$ for some
$1 \le j \le k$. We also require that $k > 1$ and $|R_i| > 1$ for some $i$.
(b) f is primitive if it is not imprimitive. 


In the definition of imprimitive, the $R_i$ are the blocks, and $\sigma(R_i) = R_j$ means that
the Galois group preserves the block structure of the roots. The requirements that
$k > 1$ and some $|R_i| > 1$ exclude the trivial block structures where there is only one
block or where every block consists of a single root.

When the polynomial is also irreducible, we get some useful information about 
the size of the blocks in the imprimitive case. 


Lemma 14.2.3 Let $f \in F[x]$ be irreducible and separable of degree $n$. Assume that
$f$ is imprimitive with roots $R_1 \cup \cdots \cup R_k$ as in Definition 14.2.2. Then every $R_i$ has
the same number of elements, say $l$. Thus $n = kl$, where $k > 1$ and $l > 1$.


Proof: Given blocks $R_i$ and $R_j$ with $i \ne j$, pick $\alpha \in R_i$ and $\beta \in R_j$. Since $f$ is
irreducible and $L$ is its splitting field over $F$, $\mathrm{Gal}(L/F)$ acts transitively on the roots.
Thus there is $\sigma \in \mathrm{Gal}(L/F)$ such that $\sigma(\alpha) = \beta$. Since $f$ is imprimitive, we have
$\sigma(R_i) = R_j$, so that $|R_i| = |R_j|$, since $\sigma$ is one-to-one. If $l = |R_i|$, then

$$n = |R_1| + \cdots + |R_k| = kl,$$

since $f$ is separable. Then $k > 1$ and $l > 1$ follow from Definition 14.2.2. ∎

Here are some easy examples.


Example 14.2.4 If $f$ is irreducible and separable of prime degree $p$, then $f$ cannot
be imprimitive, since it is impossible to write $p = kl$ with $k > 1$ and $l > 1$. Hence
irreducible separable polynomials of prime degree are automatically primitive.
However, if $f$ is irreducible and separable of degree $p^2$, then $f$ can be either
primitive or imprimitive. In the latter case, we must have $p$ blocks, each consisting
of $p$ roots. When $p = 2$, we saw an instance of this in Example 14.2.1. ◁▷


We translate these concepts into group theory as follows. 


Definition 14.2.5 Let $G$ be a subgroup of $S_n$. Then:
(a) $G$ is imprimitive if there is a disjoint union

$$\{1,\dots,n\} = R_1 \cup \cdots \cup R_k, \quad R_i \ne \emptyset \text{ for all } i,$$

such that for every $\tau \in G$ and every $1 \le i \le k$, we have $\tau(R_i) = R_j$ for some
$1 \le j \le k$. We also require that $k > 1$ and $|R_i| > 1$ for some $i$.
(b) $G$ is primitive if it is not imprimitive.

Here is an example of an imprimitive permutation group.


Example 14.2.6 The subgroup $G = \langle (1324), (34) \rangle \subset S_4$ is imprimitive via the blocks
$R_1 = \{1,2\}$ and $R_2 = \{3,4\}$. This follows because $(1324)$ maps each block to the
other while $(34)$ takes each block to itself.

If we label the roots of $x^4 - 2$ as $\alpha_1 = \sqrt[4]{2}$, $\alpha_2 = -\sqrt[4]{2}$, $\alpha_3 = i\sqrt[4]{2}$, $\alpha_4 = -i\sqrt[4]{2}$, then
$\mathrm{Gal}(\mathbb{Q}(i,\sqrt[4]{2})/\mathbb{Q}) \simeq G \subset S_4$. Do you see how the above blocks relate to those used
in Example 14.2.1? ◁▷
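
The claims of Example 14.2.6 are easy to confirm by machine. The following Python sketch generates the subgroup of $S_4$ from its two generators and checks that every element permutes the blocks $\{1,2\}$ and $\{3,4\}$. It is only an illustration: the helper names are hypothetical, and a permutation is stored as a tuple whose entry in position $i - 1$ is the image of $i$.

```python
def from_cycles(cycles, n):
    """Build a permutation of {1,...,n} from disjoint cycles such as [(1, 3, 2, 4)]."""
    images = list(range(1, n + 1))
    for cycle in cycles:
        for pos, point in enumerate(cycle):
            images[point - 1] = cycle[(pos + 1) % len(cycle)]
    return tuple(images)

def compose(s, t):
    """Return s∘t (apply t first, then s)."""
    return tuple(s[t[i] - 1] for i in range(len(t)))

def generate(generators):
    """Close the generators under multiplication; in a finite group this is the subgroup."""
    n = len(generators[0])
    identity = tuple(range(1, n + 1))
    group = {identity}
    frontier = [identity]
    while frontier:
        g = frontier.pop()
        for s in generators:
            h = compose(s, g)
            if h not in group:
                group.add(h)
                frontier.append(h)
    return group

def preserves_blocks(perm, blocks):
    """True if perm maps every block onto some block."""
    block_set = {frozenset(b) for b in blocks}
    return all(frozenset(perm[x - 1] for x in b) in block_set for b in blocks)

G = generate([from_cycles([(1, 3, 2, 4)], 4), from_cycles([(3, 4)], 4)])
blocks = [{1, 2}, {3, 4}]

if __name__ == "__main__":
    assert len(G) == 8
    assert all(preserves_blocks(g, blocks) for g in G)
```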


Lemma 14.2.3 has the following group-theoretic analog. 


Lemma 14.2.7 Let $G$ be a transitive subgroup of $S_n$. Assume that $G$ is imprimitive
with blocks $R_1,\dots,R_k$ as in Definition 14.2.5. Then every $R_i$ has the same number of
elements, say $l$. Thus $n = kl$, where $k > 1$ and $l > 1$. ∎


We omit the proof because it is identical to the proof of Lemma 14.2.3. Be sure 
you understand this. 


B. Wreath Products. Let f € F[x] be separable, irreducible, and imprimitive of 
degree n. How does being imprimitive restrict the Galois group of f? As we will 
see, the answer involves the concept of wreath product. 

By what we’ve done so far, our question reduces to the study of transitive imprimitive
subgroups $G \subset S_n$. By Lemma 14.2.3, we have $k > 1$ blocks, each consisting
of $l > 1$ elements, where $n = kl$. To begin our analysis, we will regard $S_n = S_{kl}$ as
permutations of the product $\{1,\dots,k\} \times \{1,\dots,l\}$ and use the blocks

$$\{1,\dots,k\} \times \{1,\dots,l\} = R_1 \cup \cdots \cup R_k, \quad R_i = \{i\} \times \{1,\dots,l\}. \tag{14.3}$$

Then consider

$$S_k \wr S_l = \{\sigma \in S_{kl} \mid \text{there is } \tau \in S_k \text{ with } \sigma(R_i) = R_{\tau(i)} \text{ for } 1 \le i \le k\}. \tag{14.4}$$

It is easy to see that $S_k \wr S_l$ is an imprimitive subgroup of $S_{kl} = S_n$ with respect to the
blocks $R_1,\dots,R_k$. We call $S_k \wr S_l$ the wreath product of $S_k$ with $S_l$.

We can describe an element $\sigma \in S_k \wr S_l$ as follows. Since $R_i = \{i\} \times \{1,\dots,l\}$ and
$\sigma(R_i) = R_{\tau(i)}$, there is a unique $\mu_i \in S_l$ such that for all $(i,j) \in R_i$, we have

$$\sigma(i,j) = (\tau(i), \mu_i(j)) \in R_{\tau(i)}.$$

Thus $\mu_i$ describes how $\sigma$ maps $R_i$ to $R_{\tau(i)}$. If we write $\sigma = (\tau; \mu_1,\dots,\mu_k)$, then

$$S_k \wr S_l = \{(\tau; \mu_1,\dots,\mu_k) \mid \tau \in S_k,\ \mu_1,\dots,\mu_k \in S_l\}. \tag{14.5}$$

In more concrete terms, think of a dresser with $k$ drawers and $l$ items in each drawer.
Then elements of the wreath product (14.5) permute the items in the dresser by
permuting the drawers via $\tau$ and permuting the items in each drawer via the $\mu_i$.

Using (14.5), we can describe some interesting subgroups of $S_k \wr S_l$. Given subgroups
$A \subset S_k$ and $B \subset S_l$, define the set

$$A \wr B = \{(\tau; \mu_1,\dots,\mu_k) \mid \tau \in A,\ \mu_1,\dots,\mu_k \in B\}.$$
This is a subgroup with the following properties.


Lemma 14.2.8 Let $A \subset S_k$ and $B \subset S_l$ be subgroups. Then:

(a) $A \wr B$ is a subgroup of $S_k \wr S_l$.

(b) The map $(\tau; \mu_1,\dots,\mu_k) \mapsto \tau$ defines a group homomorphism $A \wr B \to A$ that is
onto and whose kernel is isomorphic to $B^k = B \times \cdots \times B$ ($k$ times).


Proof: Given $\sigma = (\tau; \mu_1,\dots,\mu_k) \in A \wr B$, we first show that $\sigma^{-1} \in A \wr B$. Since $\sigma$
maps $R_i$ to $R_{\tau(i)}$ via $\mu_i$, it is clear that $\sigma^{-1}$ maps $R_{\tau(i)}$ to $R_i$ via $\mu_i^{-1}$. If we set $j = \tau(i)$,
then $i = \tau^{-1}(j)$, so that $\sigma^{-1}$ maps $R_j$ to $R_{\tau^{-1}(j)}$ via $\mu_{\tau^{-1}(j)}^{-1}$. It follows that

$$\sigma^{-1} = (\tau^{-1}; \mu_{\tau^{-1}(1)}^{-1},\dots,\mu_{\tau^{-1}(k)}^{-1}). \tag{14.6}$$

This obviously lies in $A \wr B$. In Exercise 1 you will show that

$$(\tau; \mu_1,\dots,\mu_k)(\tau'; \mu'_1,\dots,\mu'_k) = (\tau\tau'; \mu_{\tau'(1)}\mu'_1,\dots,\mu_{\tau'(k)}\mu'_k). \tag{14.7}$$

Hence $A \wr B$ is closed under multiplication, and part (a) follows.

It remains to prove part (b) of the lemma. The multiplication formula (14.7) shows
that $(\tau; \mu_1,\dots,\mu_k) \mapsto \tau$ is a homomorphism, which is clearly onto by the definition
of $A \wr B$. Furthermore, its kernel is clearly the set

$$\{(e; \mu_1,\dots,\mu_k) \mid \mu_1,\dots,\mu_k \in B\}.$$

Then (14.7) shows that the obvious map to $B^k$ is a group isomorphism. ∎
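
Formulas (14.6) and (14.7) can be tested numerically for small $k$ and $l$. The Python sketch below stores $\sigma = (\tau; \mu_1,\dots,\mu_k)$ as a pair of tuples, lets it act on pairs $(i,j)$ by $(i,j) \mapsto (\tau(i), \mu_i(j))$, and compares the formulas with honest composition of maps. It assumes 0-based indices in place of the text's $1,\dots,k$ and $1,\dots,l$, and the function names are hypothetical.

```python
from itertools import permutations, product

def act(sigma, point):
    """Apply sigma = (tau, mus) to (i, j): the result is (tau(i), mu_i(j))."""
    tau, mus = sigma
    i, j = point
    return (tau[i], mus[i][j])

def multiply(sigma, sigma2):
    """Formula (14.7): the i-th entry of the product is mu_{tau'(i)} composed with mu'_i."""
    tau, mus = sigma
    tau2, mus2 = sigma2
    k, l = len(tau), len(mus[0])
    new_tau = tuple(tau[tau2[i]] for i in range(k))
    new_mus = tuple(tuple(mus[tau2[i]][mus2[i][j]] for j in range(l))
                    for i in range(k))
    return (new_tau, new_mus)

def invert(sigma):
    """Formula (14.6): the i-th entry of the inverse is the inverse of mu_{tau^{-1}(i)}."""
    tau, mus = sigma
    k, l = len(tau), len(mus[0])
    tau_inv = [0] * k
    for i in range(k):
        tau_inv[tau[i]] = i
    new_mus = []
    for i in range(k):
        mu = mus[tau_inv[i]]
        mu_inv = [0] * l
        for j in range(l):
            mu_inv[mu[j]] = j
        new_mus.append(tuple(mu_inv))
    return (tuple(tau_inv), tuple(new_mus))

def wreath_elements(k, l):
    """All elements (tau; mu_1, ..., mu_k) of S_k wr S_l, as in (14.5)."""
    return [(tau, mus) for tau in permutations(range(k))
            for mus in product(permutations(range(l)), repeat=k)]

def check_formulas(k, l):
    """Compare (14.6) and (14.7) with composition of maps on {0..k-1} x {0..l-1}."""
    elements = wreath_elements(k, l)
    points = [(i, j) for i in range(k) for j in range(l)]
    for s in elements:
        s_inv = invert(s)
        if any(act(s_inv, act(s, pt)) != pt for pt in points):
            return False
        for t in elements:
            st = multiply(s, t)
            if any(act(st, pt) != act(s, act(t, pt)) for pt in points):
                return False
    return True

if __name__ == "__main__":
    assert check_formulas(2, 3)
    assert len(wreath_elements(2, 3)) == 2 * 6 ** 2
```

The count in the last line agrees with the description (14.5), which gives $k!\,(l!)^k$ elements.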

The subgroups $S_k \wr S_l \subset S_{kl} = S_n$ have the following important property.


Proposition 14.2.9 Every transitive imprimitive subgroup of $S_n$ is conjugate to a
subgroup of $S_k \wr S_l$ for some nontrivial factorization $n = kl$.


Proof: Let $G \subset S_n$ be transitive and imprimitive. By Lemma 14.2.7, we have blocks

$$R'_1 \cup \cdots \cup R'_k,$$

where each $R'_i$ has $l$ elements and every $\sigma \in G$ maps $R'_i$ to some $R'_j$. To compare this
to (14.3), pick $\tau \in S_n$ with the property that $\tau(R_i) = R'_i$ for $1 \le i \le k$. Such a $\tau$ exists
because $|R_i| = |R'_i|$. One easily checks that if $\sigma \in G$ maps $R'_i$ to $R'_j$, then $\tau^{-1}\sigma\tau$ maps
$R_i$ to $R_j$. It follows that $\tau^{-1}G\tau \subset S_k \wr S_l$. ∎


Proposition 14.2.9 implies the following result about the Galois group of an 
imprimitive polynomial. 


Corollary 14.2.10 Let $f \in F[x]$ be separable, irreducible, and imprimitive of degree
$n$. Then $n$ has a nontrivial factorization $n = kl$ such that the Galois group of $f$ over
$F$ is isomorphic to a subgroup of $S_k \wr S_l$. ∎


Here is an example of this result in degree 6. 
Example 14.2.11 Suppose that $f = x^6 + bx^4 + cx^2 + d \in F[x]$ is irreducible and
separable over a field of characteristic $\ne 2$. It is easy to see that this polynomial is
imprimitive, for if $\alpha$ is a root, then so is $-\alpha$. Hence the roots can be partitioned into
the three blocks

$$R_1 = \{\alpha,-\alpha\}, \quad R_2 = \{\beta,-\beta\}, \quad R_3 = \{\gamma,-\gamma\}$$

that are obviously permuted by the Galois group. Thus the Galois group of $f$ over
$F$ is isomorphic to a subgroup of $S_3 \wr S_2 \subset S_6$. This group has order $6 \cdot 2^3 = 48$. For
$x^6 - 4x^2 + 1 \in \mathbb{Q}[x]$, the galois command of Maple shows that its Galois group over
$\mathbb{Q}$ has order 48. Hence the Galois group is isomorphic to $S_3 \wr S_2$.

For $x^6 - 4x^2 - 1 \in \mathbb{Q}[x]$, the discriminant is $2^6 \cdot 229^2$, which is a square in $\mathbb{Q}$. Thus its
Galois group over $\mathbb{Q}$ is isomorphic to a subgroup of $(S_3 \wr S_2) \cap A_6$, which has order 24
(you will verify this in Exercise 2). The galois command shows that the Galois group has
order 24, so that the Galois group is isomorphic to $(S_3 \wr S_2) \cap A_6$ in this case. See
Exercise 3 for more on the structure of these groups. ◁▷
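
The two group orders quoted in Example 14.2.11 can be confirmed by listing permutations. The following Python sketch assumes the roots are labeled $0,\dots,5$ with blocks $\{0,1\}$, $\{2,3\}$, $\{4,5\}$ (a relabeling of the blocks in Exercise 2) and simply counts the block-preserving permutations in $S_6$ and the even ones among them. The names are hypothetical, and the count does not replace the proof asked for in Exercise 2.

```python
from itertools import permutations

# Blocks R_1, R_2, R_3 with the six roots relabeled 0..5.
BLOCKS = [frozenset({0, 1}), frozenset({2, 3}), frozenset({4, 5})]

def preserves_blocks(perm):
    """True if perm maps each block onto a block, i.e. perm lies in S_3 wr S_2."""
    return all(frozenset(perm[x] for x in block) in BLOCKS for block in BLOCKS)

def is_even(perm):
    """Parity via the inversion count: even permutations have an even number of inversions."""
    n = len(perm)
    inversions = sum(1 for i in range(n) for j in range(i + 1, n)
                     if perm[i] > perm[j])
    return inversions % 2 == 0

wreath = [s for s in permutations(range(6)) if preserves_blocks(s)]
even_part = [s for s in wreath if is_even(s)]

if __name__ == "__main__":
    assert len(wreath) == 48      # |S_3 wr S_2| = 6 * 2^3
    assert len(even_part) == 24   # |(S_3 wr S_2) ∩ A_6|
```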


When $n = p^2$ and $p$ is prime, the only nontrivial factorization of $n$ is $n = p \cdot p$.
This gives the following corollary of Proposition 14.2.9 that will be useful later in 
the section. 


Corollary 14.2.12 Every transitive imprimitive subgroup of $S_{p^2}$ is conjugate to a
subgroup of $S_p \wr S_p$. ∎


For an irreducible imprimitive polynomial of degree $p^2$, it follows that the Galois
group is isomorphic to a subgroup of $S_p \wr S_p$. By (14.5), the order of this group is

$$|S_p \wr S_p| = (p!)^{p+1}.$$

This may seem like a large number, but it is actually quite small in comparison with
$|S_{p^2}| = (p^2)!$. Here is an example.


Example 14.2.13 When $p = 17$, we have

$$|S_{17^2}| = 289! \approx 2.1 \times 10^{587}, \quad\text{while}\quad |S_{17} \wr S_{17}| = (17!)^{18} \approx 8.3 \times 10^{261}.$$

So $|S_{17^2}|$ is much bigger than $|S_{17} \wr S_{17}|$. ◁▷

We conclude by determining the structure of $S_2 \wr S_2$.


Example 14.2.14 The order of $S_2 \wr S_2$ is $(2!)^3 = 8$. To figure out which group of
order 8 this is, recall from Example 14.2.6 that $\langle (1324), (34) \rangle \subset S_4$ is imprimitive.
This has order 8 and by Corollary 14.2.12 is conjugate to a subgroup of $S_2 \wr S_2$. It
follows that

$$S_2 \wr S_2 \simeq \langle (1324), (34) \rangle.$$

In particular, $S_2 \wr S_2$ is a dihedral group of order 8. ◁▷
C. The Solvable Case. We now have all of the tools needed to classify solvable
imprimitive subgroups of $S_{p^2}$. The key player is the solvable permutation group
$\mathrm{AGL}(1,\mathbb{F}_p) \subset S_p$. Using Lemma 14.2.8, we obtain the wreath product

$$\mathrm{AGL}(1,\mathbb{F}_p) \wr \mathrm{AGL}(1,\mathbb{F}_p) \subset S_p \wr S_p \subset S_{p^2}.$$

This allows us to describe all transitive imprimitive solvable subgroups of $S_{p^2}$.


Theorem 14.2.15 Let $G$ be a transitive subgroup of $S_{p^2}$. Then the following are
equivalent:

(a) $G$ is solvable and imprimitive.

(b) $G$ is conjugate to a subgroup of $\mathrm{AGL}(1,\mathbb{F}_p) \wr \mathrm{AGL}(1,\mathbb{F}_p)$.


Proof: Since $\mathrm{AGL}(1,\mathbb{F}_p)$ is solvable, $\mathrm{AGL}(1,\mathbb{F}_p) \wr \mathrm{AGL}(1,\mathbb{F}_p)$ is also solvable
by Exercise 4. Then (b) $\Rightarrow$ (a) follows easily, since every subgroup of $S_p \wr S_p$ is
imprimitive and every subgroup of $\mathrm{AGL}(1,\mathbb{F}_p) \wr \mathrm{AGL}(1,\mathbb{F}_p)$ is solvable.

We now consider (a) $\Rightarrow$ (b). Let $G \subset S_{p^2}$ be transitive, solvable, and imprimitive.
By Corollary 14.2.12, we may assume that $G \subset S_p \wr S_p$. Let $G'$ be the image of
$G$ under the homomorphism $S_p \wr S_p \to S_p$ of part (b) of Lemma 14.2.8. We claim
that $G'$ is transitive. To prove this, take any $i$ and $j$, and pick $u \in R_i$ and $v \in R_j$.
Since $G$ is transitive, there is $\sigma = (\tau; \mu_1,\dots,\mu_p) \in G$ such that $\sigma(u) = v$. Then
$\sigma(R_i) = R_j$, which implies that $\tau(i) = j$. Hence $G' \subset S_p$ is transitive. Since $G'$ is
solvable by Theorem 8.1.4, Proposition 14.1.4 implies that $\delta G'\delta^{-1} \subset \mathrm{AGL}(1,\mathbb{F}_p)$ for
some $\delta \in S_p$. It follows that after conjugating $G$ by $(\delta; e,\dots,e) \in S_p \wr S_p$, we have
$G' \subset \mathrm{AGL}(1,\mathbb{F}_p)$. Thus

$$G \subset \mathrm{AGL}(1,\mathbb{F}_p) \wr S_p \subset S_p \wr S_p, \tag{14.8}$$

and we are halfway done with the proof.
Now fix $i$ between 1 and $p$, and consider the group

$$G_i = \{\sigma \in G \mid \sigma(R_i) = R_i\}.$$

In Exercise 5 you will show that the map $G_i \to S_p$ defined by

$$(\tau; \mu_1,\dots,\mu_p) \mapsto \mu_i \tag{14.9}$$

is a group homomorphism. By Exercise 5 the image $G'_i \subset S_p$ of this map is transitive
and solvable. Then Proposition 14.1.4 implies that there is $\delta_i \in S_p$ such that $\theta =
(12\ldots p) \in \delta_i G'_i \delta_i^{-1} \subset \mathrm{AGL}(1,\mathbb{F}_p)$. Hence, after we conjugate $G$ by $(e; \delta_1,\dots,\delta_p)$,
we may assume that

$$G'_i \subset \mathrm{AGL}(1,\mathbb{F}_p) \quad\text{and}\quad \theta \in G'_i \tag{14.10}$$

for all $i$. Notice that (14.8) continues to hold.
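Now let $\sigma = (\tau; \mu_1,\dots,\mu_p) \in G$ be arbitrary, and fix $j$ between 1 and $p$. We
will prove that $\mu_j \in \mathrm{AGL}(1,\mathbb{F}_p)$ as follows. By (14.10) with $i = \tau(j)$, we can find
$(\rho; \nu_1,\dots,\nu_p) \in G$ such that $\rho(i) = i$ and $\nu_i = \theta$. Using (14.6) and (14.7), we obtain
the element

$$\gamma = \sigma^{-1}(\rho; \nu_1,\dots,\nu_p)\sigma = (\tau^{-1}\rho\tau; \lambda_1,\dots,\lambda_p) \in G, \tag{14.11}$$

where $\lambda_j = \mu_j^{-1}\theta\mu_j$ (see Exercise 5). Since

$$\tau^{-1}\rho\tau(j) = \tau^{-1}\rho(i) = \tau^{-1}(i) = j,$$

we see that $\gamma \in G_j$, so that by (14.9),

$$\lambda_j = \mu_j^{-1}\theta\mu_j \in G'_j \subset \mathrm{AGL}(1,\mathbb{F}_p).$$

It follows that $\mu_j \in \mathrm{AGL}(1,\mathbb{F}_p)$ by part (b) of Lemma 14.1.2. Since $j$ was arbitrary
and $\tau \in \mathrm{AGL}(1,\mathbb{F}_p)$ by (14.8), it follows that $G \subset \mathrm{AGL}(1,\mathbb{F}_p) \wr \mathrm{AGL}(1,\mathbb{F}_p)$. ∎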


By Theorem 8.5.3, a polynomial is solvable by radicals if and only if its Galois 
group is solvable. Hence we have the following corollary of Theorem 14.2.15. 


Corollary 14.2.16 Let $f \in F[x]$ be irreducible and imprimitive of degree $p^2$, and
assume that $F$ has characteristic 0. Then $f$ is solvable by radicals over $F$ if and only
if the Galois group of $f$ over $F$ is isomorphic to a subgroup of the wreath product
$\mathrm{AGL}(1,\mathbb{F}_p) \wr \mathrm{AGL}(1,\mathbb{F}_p)$. ∎


This corollary shows that the size of the Galois group of an irreducible solvable
imprimitive polynomial of degree $p^2$ is bounded by

$$|\mathrm{AGL}(1,\mathbb{F}_p) \wr \mathrm{AGL}(1,\mathbb{F}_p)| = p^{p+1}(p-1)^{p+1}.$$

As $p$ gets larger, this becomes very small in comparison with the size of $S_{p^2}$.


Example 14.2.17 When $p = 17$, $|S_{17^2}| \approx 2.1 \times 10^{587}$ and $|S_{17} \wr S_{17}| \approx 8.3 \times 10^{261}$ by
Example 14.2.13. In contrast,

$$|\mathrm{AGL}(1,\mathbb{F}_{17}) \wr \mathrm{AGL}(1,\mathbb{F}_{17})| = 17^{18}\,16^{18} \approx 6.6 \times 10^{43}.$$

Hence, while a random polynomial of degree $17^2$ can have a Galois group as large
as $S_{17^2}$, an irreducible solvable imprimitive polynomial of this degree has a much
smaller Galois group. ◁▷
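
The orders of magnitude in Examples 14.2.13 and 14.2.17 can be reproduced with exact integer arithmetic. The sketch below uses only the order formulas stated in the text, namely $(p^2)!$, $(p!)^{p+1}$, and $p^{p+1}(p-1)^{p+1}$. The function names are hypothetical.

```python
from math import factorial

def order_symmetric_p_squared(p):
    """|S_{p^2}| = (p^2)!"""
    return factorial(p * p)

def order_wreath_symmetric(p):
    """|S_p wr S_p| = (p!)^(p+1)"""
    return factorial(p) ** (p + 1)

def order_wreath_affine(p):
    """|AGL(1,F_p) wr AGL(1,F_p)| = p^(p+1) * (p-1)^(p+1)"""
    return (p * (p - 1)) ** (p + 1)

def magnitude(n):
    """Exponent of 10 in the scientific notation of the positive integer n."""
    return len(str(n)) - 1

if __name__ == "__main__":
    p = 17
    for name, order in [("S_{p^2}", order_symmetric_p_squared(p)),
                        ("S_p wr S_p", order_wreath_symmetric(p)),
                        ("AGL wr AGL", order_wreath_affine(p))]:
        print(name, "has order about 10 ^", magnitude(order))
```

For $p = 17$ the three exponents are 587, 261, and 43, matching the examples.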


Mathematical Notes 


Here are some further remarks on wreath products. 


• Wreath Products. The wreath product defined in the text can be generalized as
follows. Let $G$ be any group and let $A \subset S_n$ be a permutation group. Then set

$$A \wr G = \{(\tau; g_1,\dots,g_n) \mid \tau \in A,\ g_1,\dots,g_n \in G\}.$$

Following (14.7), we define a group operation on this set via

$$(\tau; g_1,\dots,g_n)(\tau'; g'_1,\dots,g'_n) = (\tau\tau'; g_{\tau'(1)}g'_1,\dots,g_{\tau'(n)}g'_n).$$

In Exercise 6 you will show that this makes $A \wr G$ into a group that satisfies part (b)
of Lemma 14.2.8. You will also show that if $G$ is finite, then

$$|A \wr G| = |A||G|^n.$$

One surprise is that we can represent a wreath product as a semidirect product. See
Exercise 7 for the details. Further information about wreath products can be found
in [7, p. 81] and [14, pp. 219-228].


Historical Notes 


The term “primitive” is due to Galois, though he said “not primitive” instead of 
“imprimitive.” Like us, he began with polynomials. Here is his definition of “not 
primitive” [Galois, p. 163]: 

One calls equations not primitive the equations that are, for example, of degree 
mn, which decompose into m factors of degree n by means of a single equation 
of degree m. Such are the equations of Gauss. Primitive equations are those that 
do not possess such a simplification. 


To understand this, suppose that $f \in F[x]$ is separable of degree $mn$ with splitting
field $F \subset L$. Having “a single equation of degree m” means that we adjoin all roots
of such a polynomial of degree $m$. This gives a subfield

$$F \subset K \subset L$$

such that $F \subset K$ is Galois, and having $f$ decompose “into m factors of degree n by
means of” this subfield means that there is a factorization

$$f = \prod_{i=1}^{m} f_i, \quad f_i \in K[x]. \tag{14.12}$$

We can relate this to our definition of imprimitive as follows. Let $f \in F[x]$ be monic,
separable, and imprimitive, and assume also that $f$ is irreducible (as is implicit in
Galois’s definition). By Lemma 14.2.3, we can assume that the roots of $f$ fall into
$m$ blocks $R_1,\dots,R_m$, each consisting of $n$ roots (so that $f$ has degree $mn$). Then let
$f_i$ be the monic polynomial whose roots are the elements of $R_i$. In Exercise 8 you
will show that $f_i \in K[x]$, where $F \subset K$ is a Galois extension determined by the block
structure of the roots. Hence we recover (14.12).

In the above quotation, Galois claims that the cyclotomic equations considered by 
Gauss are imprimitive. You will prove this in Exercises 9 and 10. 

We also note that Galois applied the terms “primitive” and “not primitive” to both 
polynomials and groups. See, for example, [Galois, p. 79]. 
Exercises for Section 14.2


Exercise 1. Prove (14.7). 


Exercise 2. The wreath product $S_3 \wr S_2 \subset S_6$ can be thought of as the subgroup of all
permutations that preserve the blocks $R_1 = \{1,2\}$, $R_2 = \{3,4\}$, $R_3 = \{5,6\}$. As noted in
Example 14.2.11, $S_3 \wr S_2$ has order $6 \cdot 2^3 = 48$.

(a) Show that $(S_3 \wr S_2) \cap A_6$ has order 24.

(b) Show that $S_3 \wr S_2$ is the centralizer of $(12)(34)(56)$ in $S_6$ (meaning that $S_3 \wr S_2$ consists of
all permutations in $S_6$ that commute with $(12)(34)(56)$).

(c) Use part (b) to show that $S_3 \wr S_2$ is isomorphic to $((S_3 \wr S_2) \cap A_6) \times S_2$.

See the next exercise for more on $S_3 \wr S_2$ and $(S_3 \wr S_2) \cap A_6$.


Exercise 3. One of the challenges of group theory is that the same group can have radically
different descriptions. For instance, $S_4$ and the group $G = (S_3 \wr S_2) \cap A_6$ appearing in
Example 14.2.11 both have order 24. In this exercise, you will prove that they are isomorphic. We
will use the notation of Exercise 2.

(a) There is a natural homomorphism $G \to S_3$ given by how elements of $G$ permute the blocks
$R_1, R_2, R_3$. Show that this map is onto, and express the elements of the kernel as products
of disjoint cycles.

(b) Use the Sylow Theorems to show that $G$ has one or four 3-Sylow subgroups.

(c) Show that $A_6$ has no element of order 6.

(d) Use part (c) and the kernel of the map $G \to S_3$ from part (a) to show that $G$ has four
3-Sylow subgroups.

(e) $G$ acts by conjugation on its four 3-Sylow subgroups. Use this to prove that $G \simeq S_4$.

(f) Using Exercise 2, conclude that $S_3 \wr S_2 \simeq S_4 \times S_2$.

We note without proof that $S_3 \wr S_2 \simeq S_4 \times S_2$ is also isomorphic to the full symmetry group
(rotations and reflections) of the octahedron.


Exercise 4. Let $A$ and $B$ be solvable permutation groups. Prove that their wreath product $A \wr B$
is also solvable.


Exercise 5. This exercise will complete the proof of Theorem 14.2.15.

(a) Let $G_i \to S_p$ be the map defined in (14.9). Prove that it is a group homomorphism and
that its image $G'_i \subset S_p$ is transitive and solvable.

(b) Let $\sigma = (\tau; \mu_1,\dots,\mu_p)$ and $(\rho; \nu_1,\dots,\nu_p)$ be as in the proof of Theorem 14.2.15. Thus we
have a fixed $j$ such that $i = \tau(j)$, $\nu_i = \theta$, and $\rho(i) = i$. Now let $\gamma = (\tau^{-1}\rho\tau; \lambda_1,\dots,\lambda_p)$
be as in (14.11). Prove carefully that $\lambda_j = \mu_j^{-1}\theta\mu_j$.


Exercise 6. Let $A$ be a subgroup of $S_n$, and let $G$ be any group. Then define $A \wr G$ as in the
Mathematical Notes.

(a) Prove that $A \wr G$ is a group under the multiplication defined in the Mathematical Notes.

(b) State and prove a version of part (b) of Lemma 14.2.8 for $A \wr G$.

(c) Prove that $|A \wr G| = |A||G|^n$ when $G$ is finite.


Exercise 7. Let $A \wr G$ be as in Exercise 6, and let $H$ be the set of all functions
$\phi : \{1,\dots,n\} \to G$.

(a) Given $\phi, \chi \in H$, define $\phi\chi \in H$ by $(\phi\chi)(i) = \phi(i)\chi(i)$. Prove that this makes $H$ into a
group isomorphic to the product group $G^n$.

(b) Elements of $A \wr G$ can be written $(\tau,\phi)$, where $\phi \in H$. Prove that in this notation, (14.7)
becomes
$$(\tau,\phi)(\tau',\phi') = (\tau\tau', ((\tau')^{-1} \cdot \phi)\phi').$$

(c) $A \subset S_n$ acts on $\{1,\dots,n\}$. Show that this induces an action of $A$ on $H$ via $(\tau \cdot \phi)(i) =
\phi(\tau^{-1}(i))$. Be sure you understand why the inverse is necessary.

(d) The action of part (c) enables us to define the semidirect product $H \rtimes A$. Using the
description of $A \wr G$ given in part (b), prove that the map

$$(\tau,\phi) \mapsto (\tau \cdot \phi, \tau)$$

defines a group isomorphism $A \wr G \simeq H \rtimes A$. This shows that wreath products can be
represented as semidirect products.


Exercise 8. The goal of this exercise is to relate Definition 14.2.2 to Galois’s definition of not
primitive. Let $f \in F[x]$ be monic, separable, and irreducible with splitting field $F \subset L$. Also
assume that $f$ is imprimitive with blocks of roots given by $R_1,\dots,R_m$, where each block has $n$
elements (thus $\deg(f) = mn$). Let $f_i$ be the monic polynomial whose roots are the elements of
$R_i$, and let $K \subset L$ be the fixed field of $\{\sigma \in \mathrm{Gal}(L/F) \mid \sigma(R_i) = R_i \text{ for all } i\}$.

(a) Show that $f = \prod_{i=1}^{m} f_i$ and that $f_i \in K[x]$ for all $i$.

(b) In Galois’s definition, $K$ is obtained by adjoining the roots of a separable polynomial
of degree $m$. In modern terms, Galois wants $F \subset K$ to be a Galois extension such that
$\mathrm{Gal}(K/F)$ is isomorphic to a subgroup of $S_m$. Prove that the field $K$ defined in part (a)
has these properties.

See Exercise 14 for some examples. 


Exercise 9. Assume that $G \subset S_n$ is transitive and Abelian.

(a) Prove that |G| = n by considering the isotropy subgroups of G. 

(b) Prove that G is primitive if and only if |G| is prime. 

Thus a transitive Abelian permutation group is imprimitive unless it is cyclic of prime order. 


Exercise 10. Let $\Phi_p(x)$ be the cyclotomic polynomial whose roots are the primitive pth roots
of unity, where p is prime. We know that $\Phi_p(x)$ is irreducible of degree $p - 1$. In the quotation
given in the Historical Notes, Galois asserts that $\Phi_p(x)$ is imprimitive.

(a) Prove Galois’s claim for p > 3 using Exercise 9. 

(b) Explain why we need to assume that p > 3 in part (a). 


Exercise 11. Given a prime p, let $C_p \subset S_p$ be the cyclic subgroup generated by the p-cycle
$(1\,2\,\dots\,p)$. As explained in the text, this gives the wreath product $C_p \wr C_p \subset S_{p^2}$. Prove that
$C_p \wr C_p$ is a p-Sylow subgroup of $S_{p^2}$.


Exercise 12. Let f be an irreducible imprimitive polynomial of degree 6, 8, or 9 over a field F
of characteristic 0. Prove that f is solvable by radicals over F.


Exercise 13. Let $f = x^6 + bx^3 + c \in F[x]$ be irreducible, where F has characteristic different
from 2 or 3. We will study the size of the Galois group of f over F.
(a) Show that f is separable. Thus we can think of the Galois group as a subgroup of $S_6$.
(b) Show that $x^6 + bx^3 + c$ is imprimitive and that its Galois group lies in $S_2 \wr S_3$. Also show
that $|S_2 \wr S_3| = 72$. Thus the Galois group has order $\le 72$.
(c) Let $F \subset L$ be the splitting field of f over F. Use the Tower Theorem to show that
$[L : F] \le 36$. Hence the Galois group has order at most 36.
Using Maple, one can show that the Galois group of $x^6 + 2x^3 - 2$ over $\mathbb{Q}$ has order 36 and
hence is as large as possible.


Exercise 14. Here are some examples to illustrate Galois’s definition of imprimitive. We will
use the notation of Exercise 8. Let F be a field of characteristic different from 2 or 3.

(a) Let $f = x^6 + bx^4 + cx^2 + d \in F[x]$ be irreducible with splitting field $F \subset L$. Show that the
splitting field of $x^3 + bx^2 + cx + d$ gives an intermediate field $F \subset K \subset L$ such that $F \subset K$
is Galois and $f = f_1 f_2 f_3$, where $f_i \in K[x]$ has degree 2 for $i = 1,2,3$. Also explain how
K relates to the field K constructed in Exercise 8.

(b) Work out the analogous theory when $f = x^6 + bx^3 + c \in F[x]$ is irreducible.


Exercise 15. Let $G \subset S_n$ be transitive. Prove that G is primitive if and only if the isotropy
subgroups of G are maximal with respect to inclusion. 


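Exercise 16. Let p be prime. The ring $\mathbb{Z}/p^2\mathbb{Z}$ is not a field, but one can still define the group
$\mathrm{AGL}(1,\mathbb{Z}/p^2\mathbb{Z})$. Its action on $\mathbb{Z}/p^2\mathbb{Z}$ allows us to write $\mathrm{AGL}(1,\mathbb{Z}/p^2\mathbb{Z}) \subset S_{p^2}$.


(a) Prove that $\mathrm{AGL}(1,\mathbb{Z}/p^2\mathbb{Z})$ is solvable and transitive of order $p^3(p - 1)$.
(b) Prove that $\mathrm{AGL}(1,\mathbb{Z}/p^2\mathbb{Z}) \subset S_{p^2}$ is imprimitive.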


14.3 PRIMITIVE PERMUTATION GROUPS 


We now consider primitive permutation groups. Our main result is a powerful theorem 
of Galois on the structure of solvable primitive permutation groups. In order to prove 
this, we will define doubly transitive groups and use finite fields to construct some 
interesting permutation groups. We will also study the minimal normal subgroups 
of a solvable group. As an added bonus, we will learn why Galois was interested in 
finite fields. 

The theory developed in this section will also be used in Section 14.4 when we
classify solvable primitive subgroups of $S_{p^2}$.


A. Doubly Transitive Permutation Groups. For permutation groups, double 
transitivity is defined as follows. 


Definition 14.3.1 A subgroup $G \subset S_n$ is doubly transitive if whenever we have
$i, i', j, j' \in \{1,\dots,n\}$ such that
$$i \ne i' \quad\text{and}\quad j \ne j',$$
there is $\sigma \in G$ such that
$$\sigma(i) = j \quad\text{and}\quad \sigma(i') = j'.$$

We already know an example of a doubly transitive group.


Example 14.3.2 In Section 14.1, we considered $\mathrm{AGL}(1,\mathbb{F}_p)$ as a subgroup of $S_p$. To
prove that this is doubly transitive, consider $i \ne i'$ and $j \ne j'$, where we now regard
these as elements of $\mathbb{F}_p$. Since $i \ne i'$, there are $a, b \in \mathbb{F}_p$ such that
$$ai + b = j \quad\text{and}\quad ai' + b = j',$$
and $j \ne j'$ implies that $a \ne 0$. Thus the condition of Definition 14.3.1 is satisfied by
$\gamma_{a,b} \in \mathrm{AGL}(1,\mathbb{F}_p)$. We will generalize this example later in the section.
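
The argument of Example 14.3.2 can also be checked by brute force for a small prime. The following Python sketch is only an illustration: the helper names `agl1` and `is_doubly_transitive` are hypothetical, the points are labeled 0,...,p-1 rather than 1,...,p, and the test assumes that its input is a complete group of permutations, so that double transitivity is equivalent to the orbit of the pair (0, 1) containing all n(n - 1) ordered pairs of distinct points.

```python
def agl1(p):
    """All maps x -> a*x + b on Z/pZ with a != 0, each stored as a tuple of images."""
    return [tuple((a * x + b) % p for x in range(p))
            for a in range(1, p) for b in range(p)]

def is_doubly_transitive(group, n):
    """Brute-force test of Definition 14.3.1 for a complete group acting on range(n)."""
    # For a group, it suffices that the orbit of the pair (0, 1) is the whole set of pairs.
    orbit = {(g[0], g[1]) for g in group}
    return len(orbit) == n * (n - 1)

p = 5
G = agl1(p)
translations = [tuple((x + b) % p for x in range(p)) for b in range(p)]
print(len(G), is_doubly_transitive(G, p))       # AGL(1, F_5)
print(is_doubly_transitive(translations, p))    # the cyclic group of translations
```

For p = 5 the group has 20 = 5 · 4 elements and passes the test, while the subgroup of translations, which is transitive but too small, does not.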

The concepts of doubly transitive, primitive, and transitive are related as follows.

Proposition 14.3.3 Let $G \subset S_n$ be a subgroup. Then:
$$G \text{ doubly transitive} \implies G \text{ primitive} \implies G \text{ transitive}.$$

Proof: First suppose that G is doubly transitive and imprimitive. Then we have
blocks $R_1,\dots,R_k$, where $k > 1$ and $|R_i| > 1$ for some i. For this i, pick $i_1 \ne i_2$ in $R_i$
and also pick $i_3 \in R_j$ for some $j \ne i$. Then we have pairs $i_1 \ne i_2$ and $i_1 \ne i_3$, so that
by double transitivity we can find $\sigma \in G$ such that
$$\sigma(i_1) = i_1 \quad\text{and}\quad \sigma(i_2) = i_3.$$
Now consider $\sigma(R_i)$, which by assumption is one of the blocks $R_1,\dots,R_k$. Then
$\sigma(i_1) = i_1 \in R_i$ implies that $\sigma(R_i) = R_i$, while $\sigma(i_2) = i_3 \in R_j$ implies that $\sigma(R_i) = R_j$.
This contradiction proves the first implication.

The second implication will be proved in Exercise 1. ∎


Doubly transitive permutation groups also have the following property.

Proposition 14.3.4 If $G \subset S_n$ is doubly transitive, then $|G|$ is divisible by $n(n - 1)$.


Proof: Let $P = \{(i,j) \mid 1 \le i, j \le n,\ i \ne j\}$ be the set of pairs of distinct elements of
$\{1,\dots,n\}$. This set has $n(n - 1)$ elements, and G acts on P via $\sigma \cdot (i,j) = (\sigma(i), \sigma(j))$.
The crucial observation is that G acts transitively on P because G is doubly transitive
on $\{1,\dots,n\}$. Thus the G-orbit of any $(i,j) \in P$ has $n(n - 1)$ elements. Using the
Fundamental Theorem of Group Actions, we conclude that $n(n - 1)$ divides the order
of G. ∎
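
The counting step in this proof can be illustrated numerically. The sketch below is an assumption-laden example rather than part of the argument: it takes G = S_4 with points labeled 0,...,3, computes the orbit and the isotropy subgroup of the pair (0, 1), and checks the relation |G| = |orbit| · |isotropy subgroup| that the Fundamental Theorem of Group Actions provides.

```python
from itertools import permutations

n = 4
G = list(permutations(range(n)))      # S_4, each permutation stored as a tuple of images
pair = (0, 1)

# Orbit and isotropy subgroup of the pair under sigma . (i, j) = (sigma(i), sigma(j)).
orbit = {(g[pair[0]], g[pair[1]]) for g in G}
stabilizer = [g for g in G if (g[pair[0]], g[pair[1]]) == pair]

print(len(G), len(orbit), len(stabilizer))
```

Here the orbit is all of P, with 4 · 3 = 12 elements, and the isotropy subgroup has order 2, so that 24 = 12 · 2.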


B. Affine Linear and Semilinear Groups. The finite fields introduced in
Chapter 11 lead to some important permutation groups. Let $\mathbb{F}_q$ be a finite field with
$q = p^m$ elements, p prime, and let $\mathbb{F}_q^n$ be the standard n-dimensional vector space
over $\mathbb{F}_q$. As in the Mathematical Notes to Section 11.1, $\mathrm{GL}(n,\mathbb{F}_q)$ is the group of
invertible $n \times n$ matrices with entries in $\mathbb{F}_q$. This acts on $\mathbb{F}_q^n$ by matrix multiplication
when elements of $\mathbb{F}_q^n$ are regarded as column vectors.

Using $\mathrm{GL}(n,\mathbb{F}_q)$, we construct the larger group $\mathrm{AGL}(n,\mathbb{F}_q)$ of affine linear
transformations, which are maps $\gamma_{A,v} : \mathbb{F}_q^n \to \mathbb{F}_q^n$ defined by
$$\gamma_{A,v}(u) = Au + v, \quad A \in \mathrm{GL}(n,\mathbb{F}_q),\ v \in \mathbb{F}_q^n.$$

Thus $\mathrm{AGL}(n,\mathbb{F}_q)$ combines linear maps with translations. Note that $\mathrm{GL}(1,\mathbb{F}_q) = \mathbb{F}_q^*$,
so that when $q = p$, $\mathrm{AGL}(1,\mathbb{F}_p)$ is the one-dimensional affine linear group studied in
Sections 6.4 and 14.1.


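The group $\mathrm{AGL}(n,\mathbb{F}_q)$ contains the subgroups
$$\begin{aligned} \mathbb{F}_q^n &\simeq \{\gamma_{I_n,v} \mid v \in \mathbb{F}_q^n\} \subset \mathrm{AGL}(n,\mathbb{F}_q),\\ \mathrm{GL}(n,\mathbb{F}_q) &\simeq \{\gamma_{A,0} \mid A \in \mathrm{GL}(n,\mathbb{F}_q)\} \subset \mathrm{AGL}(n,\mathbb{F}_q), \end{aligned} \tag{14.13}$$
where $I_n \in \mathrm{GL}(n,\mathbb{F}_q)$ is the identity matrix and $0 \in \mathbb{F}_q^n$ is the zero vector. For
simplicity, we will write (14.13) as
$$\mathbb{F}_q^n \subset \mathrm{AGL}(n,\mathbb{F}_q) \quad\text{and}\quad \mathrm{GL}(n,\mathbb{F}_q) \subset \mathrm{AGL}(n,\mathbb{F}_q).$$

In Exercise 2 you will show that $\mathbb{F}_q^n$ is a normal subgroup of $\mathrm{AGL}(n,\mathbb{F}_q)$ with quotient
isomorphic to $\mathrm{GL}(n,\mathbb{F}_q)$. You will also express $\mathrm{AGL}(n,\mathbb{F}_q)$ as a semidirect product
via the action of $\mathrm{GL}(n,\mathbb{F}_q)$ on $\mathbb{F}_q^n$.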

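By using the Galois group $\mathrm{Gal}(\mathbb{F}_q/\mathbb{F}_p)$, we can enlarge $\mathrm{AGL}(n,\mathbb{F}_q)$ as follows.
An affine semilinear transformation is a map $\gamma_{A,\sigma,v} : \mathbb{F}_q^n \to \mathbb{F}_q^n$ defined by
$$\gamma_{A,\sigma,v}(u) = A\sigma(u) + v, \quad A \in \mathrm{GL}(n,\mathbb{F}_q),\ \sigma \in \mathrm{Gal}(\mathbb{F}_q/\mathbb{F}_p),\ v \in \mathbb{F}_q^n.$$

These maps form the affine semilinear group $\mathrm{A\Gamma L}(n,\mathbb{F}_q)$.

When $q = p$, one sees that $\mathrm{A\Gamma L}(n,\mathbb{F}_q) = \mathrm{AGL}(n,\mathbb{F}_q)$, and when $q = p^m$, $m > 1$,
you will prove in Exercise 3 that $\mathrm{AGL}(n,\mathbb{F}_q)$ is normal of index m in $\mathrm{A\Gamma L}(n,\mathbb{F}_q)$.
Furthermore, we have inclusions
$$\mathbb{F}_q^n \subset \mathrm{AGL}(n,\mathbb{F}_q) \subset \mathrm{A\Gamma L}(n,\mathbb{F}_q),$$
and $\mathbb{F}_q^n$ is a normal subgroup of $\mathrm{A\Gamma L}(n,\mathbb{F}_q)$ (see Exercise 3).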
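
To make the semilinear construction concrete, here is a small Python sketch for n = 1 and q = 4. It rests on assumptions that are not in the text: the elements of the field with four elements are encoded as the integers 0,...,3 (bit 0 is the constant term and bit 1 the coefficient of a root w of $x^2 + x + 1$), and the names `f4_mul`, `frobenius`, and `maps` are hypothetical. The sketch lists the maps $u \mapsto a\sigma(u) + b$ and compares the number of affine linear maps with the number of affine semilinear maps.

```python
def f4_mul(x, y):
    """Multiply in F_4 = F_2[w]/(w^2 + w + 1); bit 0 is the constant term, bit 1 the w term."""
    a0, a1, b0, b1 = x & 1, x >> 1, y & 1, y >> 1
    c0 = (a0 * b0 + a1 * b1) % 2
    c1 = (a0 * b1 + a1 * b0 + a1 * b1) % 2
    return c0 | (c1 << 1)

def frobenius(x):
    """The nontrivial element of Gal(F_4/F_2), namely x -> x^2."""
    return f4_mul(x, x)

def identity(x):
    return x

F4 = range(4)

def maps(automorphisms):
    """All u -> a*sigma(u) + b with a != 0; addition in F_4 is bitwise XOR."""
    return {tuple(f4_mul(a, s(u)) ^ b for u in F4)
            for a in range(1, 4) for b in F4 for s in automorphisms}

AGL = maps([identity])
AGammaL = maps([identity, frobenius])
print(len(AGL), len(AGammaL), AGL <= AGammaL)
```

The counts are 12 and 24, so in this case the affine linear group has index m = 2 in the affine semilinear group, as expected for $q = 2^2$.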
These groups act on $\mathbb{F}_q^n$, which means that they can be regarded as subgroups of
$S_{q^n}$. As permutation groups, they have the following important properties.


Proposition 14.3.5 The groups $\mathrm{AGL}(n,\mathbb{F}_q)$ and $\mathrm{A\Gamma L}(n,\mathbb{F}_q)$ are doubly transitive
subgroups of $S_{q^n}$. They are also primitive.


Proof: In Exercise 4 you will prove that $\mathrm{AGL}(n,\mathbb{F}_q)$ is doubly transitive when
acting on $\mathbb{F}_q^n$. Hence the same is true for the larger group $\mathrm{A\Gamma L}(n,\mathbb{F}_q)$. Both groups
are then primitive by Proposition 14.3.3. ∎
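
For a small case, double transitivity can be confirmed by direct enumeration. The sketch below makes several simplifying assumptions: it handles only a prime field (so q = p), the function name `agl` is hypothetical, vectors are numbered in the order produced by `itertools.product`, and invertibility of a matrix is tested by checking that it permutes the vectors. It enumerates the affine linear group for n = 2 and p = 3 and checks that the orbit of one pair of distinct vectors consists of all ordered pairs of distinct vectors.

```python
from itertools import product

def agl(n, p):
    """Enumerate AGL(n, F_p), p prime, as permutations of the p**n vectors of F_p^n."""
    vectors = list(product(range(p), repeat=n))
    index = {v: k for k, v in enumerate(vectors)}

    def apply(A, u):
        return tuple(sum(a * x for a, x in zip(row, u)) % p for row in A)

    group = set()
    for entries in product(range(p), repeat=n * n):
        A = [entries[r * n:(r + 1) * n] for r in range(n)]
        # A is invertible exactly when u -> Au is a bijection of the finite set F_p^n.
        if len({apply(A, u) for u in vectors}) < len(vectors):
            continue
        for v in vectors:
            group.add(tuple(index[tuple((x + y) % p for x, y in zip(apply(A, u), v))]
                            for u in vectors))
    return group

G = agl(2, 3)
points = 3 ** 2
orbit = {(g[0], g[1]) for g in G}     # orbit of the pair (vector number 0, vector number 1)
print(len(G), len(orbit) == points * (points - 1))
```

The group has 432 elements and acts transitively on the 72 ordered pairs of distinct vectors.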


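In Section 14.4, we will study solvable subgroups of $S_{p^2}$. Applying the above
theory, we get subgroups
$$\begin{aligned} q = p^2 \text{ and } n = 1 &\implies \mathbb{F}_{p^2} \subset \mathrm{AGL}(1,\mathbb{F}_{p^2}) \subset \mathrm{A\Gamma L}(1,\mathbb{F}_{p^2}),\\ q = p \text{ and } n = 2 &\implies \mathbb{F}_p^2 \subset \mathrm{AGL}(2,\mathbb{F}_p). \end{aligned} \tag{14.14}$$
However, $\mathbb{F}_{p^2}$ is a vector space over $\mathbb{F}_p$ of dimension 2. In Exercise 3 you will show
that elements of $\mathrm{A\Gamma L}(1,\mathbb{F}_{p^2})$ are affine linear when considered as maps between vector
spaces over $\mathbb{F}_p$. It follows that if we use a basis to identify $\mathbb{F}_{p^2}$ with $\mathbb{F}_p^2$, then (14.14)
gives the inclusions
$$\mathbb{F}_p^2 \simeq \mathbb{F}_{p^2} \subset \mathrm{AGL}(1,\mathbb{F}_{p^2}) \subset \mathrm{A\Gamma L}(1,\mathbb{F}_{p^2}) \subset \mathrm{AGL}(2,\mathbb{F}_p) \subset S_{p^2}. \tag{14.15}$$


C. Minimal Normal Subgroups. Before proving Galois’s theorem on solvable
primitive permutation groups, we need to take a detour into pure group theory.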

Definition 14.3.6 A normal subgroup N of a group $G \ne \{e\}$ is minimal if $N \ne \{e\}$
and all nontrivial subgroups of N (i.e., subgroups of N different from $\{e\}$ and N) are
not normal in G.


Here are some examples of minimal normal subgroups.


Example 14.3.7 Let $n \ge 5$. Then $A_n$ is clearly a minimal normal subgroup of $S_n$,
since $A_n$ is simple.


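Example 14.3.8 The translation subgroup $\mathbb{F}_q^n$ is a normal subgroup of the affine
linear group $\mathrm{AGL}(n,\mathbb{F}_q)$, where as above we identify $v \in \mathbb{F}_q^n$ with the translation
$\gamma_{I_n,v} \in \mathrm{AGL}(n,\mathbb{F}_q)$. Since $\mathbb{F}_q^n$ is Abelian, any subgroup of $\mathbb{F}_q^n$ is normal in $\mathbb{F}_q^n$. But
when is such a subgroup normal in $\mathrm{AGL}(n,\mathbb{F}_q)$? To answer this, note that
$$\gamma_{A,w} \circ \gamma_{I_n,v} \circ \gamma_{A,w}^{-1} = \gamma_{I_n,Av}$$
by part (b) of Exercise 2. Since $\mathrm{GL}(n,\mathbb{F}_q)$ acts transitively on $\mathbb{F}_q^n \setminus \{0\}$ by part (c)
of Exercise 4, it follows easily that if $\{0\} \ne H \subset \mathbb{F}_q^n$ is normal in $\mathrm{AGL}(n,\mathbb{F}_q)$, then
$H = \mathbb{F}_q^n$. Thus $\mathbb{F}_q^n \subset \mathrm{AGL}(n,\mathbb{F}_q)$ is a minimal normal subgroup.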
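
The conjugation identity used in this example is easy to test numerically. The following sketch is an illustration under assumed data: the prime p = 5, the matrix A, and the vectors v and w are arbitrary choices, and the helper names are hypothetical. To avoid inverting anything, it checks the equivalent statement that applying translation by v and then the affine map with matrix A and vector w gives the same result as applying that affine map first and then translation by Av.

```python
from itertools import product

p = 5
A = ((1, 2), (3, 4))      # det = -2 = 3 (mod 5), so A is invertible
w, v = (2, 1), (4, 3)

def mat_vec(M, u):
    return tuple(sum(m * x for m, x in zip(row, u)) % p for row in M)

def add(u, t):
    return tuple((x + y) % p for x, y in zip(u, t))

def gamma(M, t):
    """The affine map u -> M u + t."""
    return lambda u: add(mat_vec(M, u), t)

I2 = ((1, 0), (0, 1))
g, t_v, t_Av = gamma(A, w), gamma(I2, v), gamma(I2, mat_vec(A, v))

# Conjugating translation by v with g gives translation by Av; moving the
# inverse to the other side, this says that g o t_v and t_Av o g agree everywhere.
holds = all(g(t_v(u)) == t_Av(g(u)) for u in product(range(p), repeat=2))
print(holds)
```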


Example 14.3.9 Consider the wreath product $G = S_2 \wr A_\ell \subset S_{2\ell}$, where $\ell \ge 5$. Part (b)
of Lemma 14.2.8 shows that the subgroup
$$A_\ell \times A_\ell \simeq \{(e; \mu_1, \mu_2) \mid \mu_i \in A_\ell\} \subset S_2 \wr A_\ell = G$$
is normal in G. We will regard $N = A_\ell \times A_\ell$ as a subgroup of G. In Exercise 5 you
will prove that N has the following properties:

• The nontrivial normal subgroups of N are $\{e\} \times A_\ell$ and $A_\ell \times \{e\}$.
• The factors of $A_\ell \times A_\ell$ get permuted under conjugation by elements of
$$S_2 \simeq \{(\tau; e, e) \mid \tau \in S_2\} \subset S_2 \wr A_\ell = G.$$

Now suppose that a nontrivial subgroup $H \subset N = A_\ell \times A_\ell$ is normal in G. Then
H is normal in N, so that $H = \{e\} \times A_\ell$ or $A_\ell \times \{e\}$ by the first bullet. But these
subgroups can’t be normal in G, by the second bullet. We conclude that $N = A_\ell \times A_\ell$
is a minimal normal subgroup of G.


The minimal normal subgroups in these examples are simple (Example 14.3.7) or 
products of simple groups (Examples 14.3.8 and 14.3.9). The following result shows 
that this is no accident. 


Proposition 14.3.10 Let N be a minimal normal subgroup of a finite group G. Then
there is a simple group A such that we have an isomorphism
$$N \simeq A^n = \underbrace{A \times \cdots \times A}_{n \text{ times}}$$
for some $n \ge 1$.

Proof: Let A be a minimal normal subgroup of N. Given $g \in G$, set $A_g = gAg^{-1}$.
Exercise 6 shows that $A_g$ is a minimal normal subgroup of N isomorphic to A.

We will first prove that $N \simeq A^n$ for some $n \ge 1$. If $A = N$, then we are done.
So suppose that $A \ne N$. By the minimality of N, we know that $A_{g_1} \ne A$ for some
$g_1 \in G$. Since the intersection of normal subgroups of N is normal in N and since A
is minimal in N, we must have $A_{g_1} \cap A = \{e\}$. Then Exercise 7 implies that
$$AA_{g_1} = \{aa_1 \mid a \in A,\ a_1 \in A_{g_1}\} \subset N$$
is a normal subgroup of N isomorphic to the product group $A \times A_{g_1}$. If $AA_{g_1} = N$,
then we are done, since $A_{g_1} \simeq A$.

Suppose that $AA_{g_1} \ne N$. If $A_g \subset AA_{g_1}$ for all $g \in G$, then it is easy to show that $AA_{g_1}$
is normal in G (see Exercise 6). This is impossible by the minimality of N. Hence
there is $g_2 \in G$ such that $A_{g_2} \not\subset AA_{g_1}$. Then $(AA_{g_1}) \cap A_{g_2} = \{e\}$, since the left-hand
side is normal in N and lies in the minimal normal subgroup $A_{g_2}$. Arguing as in the
previous paragraph, N contains the subgroup
$$AA_{g_1}A_{g_2} \simeq A \times A_{g_1} \times A_{g_2} \simeq A^3.$$
If $AA_{g_1}A_{g_2} = N$, then we are done, and if not, we continue as above. In Exercise 6
you will show that this eventually leads to the desired isomorphism $N \simeq A^n$.

It remains to prove that A is simple. The isomorphism $N \simeq A^n$ takes $A \subset N$ to
$A \times \{e\} \times \cdots \times \{e\} \subset A^n$. If $B \subset A$ is a nontrivial normal subgroup, then $N \simeq A^n$
takes B to the subgroup
$$B \times \{e\} \times \cdots \times \{e\} \subset A \times A \times \cdots \times A = A^n,$$
which is easily seen to be normal, since B is normal in A. It follows that B is normal
in N. This is impossible, since A is a minimal normal subgroup of N. Hence A must
be simple, and the proposition is proved. ∎


When N is a minimal normal subgroup of a solvable group G, the simple group
A appearing in $N \simeq A^n$ must also be solvable. The only solvable simple groups are
cyclic of prime order, so that $A \simeq \mathbb{F}_p$ as groups. Thus we have proved the following
corollary of Proposition 14.3.10.


Corollary 14.3.11 Let N be a minimal normal subgroup of a finite solvable group.
Then there is a prime p such that $N \simeq \mathbb{F}_p^n$ for some $n \ge 1$. ∎
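
As an illustration of the corollary, one can search for the minimal normal subgroups of the solvable group $S_4$ by computer. The sketch below is a brute-force experiment with hypothetical helper names, with points labeled 0,...,3. It relies on the observation that a minimal normal subgroup is the normal closure of each of its non-identity elements, so the minimal normal subgroups are exactly the normal closures that properly contain no other normal closure.

```python
from itertools import permutations

def compose(s, t):
    """(s o t)(i) = s(t(i)) for permutations stored as tuples of images."""
    return tuple(s[i] for i in t)

def inverse(s):
    inv = [0] * len(s)
    for i, si in enumerate(s):
        inv[si] = i
    return tuple(inv)

def generated(gens, identity):
    """Subgroup generated by gens (in a finite group, products of generators suffice)."""
    elems, frontier = {identity}, [identity]
    while frontier:
        x = frontier.pop()
        for g in gens:
            y = compose(g, x)
            if y not in elems:
                elems.add(y)
                frontier.append(y)
    return frozenset(elems)

n = 4
G = list(permutations(range(n)))
e = tuple(range(n))

# The normal closure of g is generated by all conjugates h g h^{-1}.
closures = {generated({compose(compose(h, g), inverse(h)) for h in G}, e)
            for g in G if g != e}
minimal = [N for N in closures if not any(M < N for M in closures)]

N = minimal[0]
print(len(minimal), len(N), all(compose(x, x) == e for x in N))
```

The search finds a single minimal normal subgroup, of order 4, in which every element squares to the identity. This is the Klein four-group, which is isomorphic to $\mathbb{F}_2^2$, in agreement with the corollary for p = 2 and n = 2.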


D. The Solvable Case. Before proving Galois’s structure theorem for solvable 
primitive permutation groups, we need some preliminary definitions and results. 

We first want to say more about inclusions such as $\mathrm{AGL}(n,\mathbb{F}_q) \subset S_{q^n}$, which we
obtain by identifying $\mathbb{F}_q^n$ with $\{1,\dots,q^n\}$. This is done carefully as follows. Given
a set T, let $S(T) = \{\varphi : T \to T \mid \varphi \text{ is one-to-one and onto}\}$. This is a group under
composition, called the symmetry group of T. Here are some examples.


Example 14.3.12 Since every affine linear or semilinear transformation of $\mathbb{F}_q^n$ is
one-to-one and onto, we have natural inclusions
$$\mathrm{AGL}(n,\mathbb{F}_q) \subset \mathrm{A\Gamma L}(n,\mathbb{F}_q) \subset S(\mathbb{F}_q^n). \tag{14.16}$$

Example 14.3.13 For a more basic example, note that $S(\{1,\dots,\ell\})$ is the symmetric
group $S_\ell$.


If T has $\ell$ elements, then there is a one-to-one onto map $\gamma : T \to \{1,\dots,\ell\}$. It is
easy to check that $\hat\gamma(\varphi) = \gamma \circ \varphi \circ \gamma^{-1}$ defines a group isomorphism
$$\hat\gamma : S(T) \simeq S_\ell.$$
Under $\hat\gamma$, a subgroup $G \subset S(T)$ maps to a subgroup of $S_\ell$. In Exercise 8 you will
show that if we use a different map $\gamma' : T \to \{1,\dots,\ell\}$, then G maps to a second
subgroup of $S_\ell$ conjugate to the first.

In particular, a one-to-one onto map $\gamma : \mathbb{F}_q^n \to \{1,\dots,q^n\}$ gives a group isomorphism
$\hat\gamma : S(\mathbb{F}_q^n) \simeq S_{q^n}$. Applying $\hat\gamma$ to (14.16) gives subgroups of $S_{q^n}$ also called
$\mathrm{AGL}(n,\mathbb{F}_q)$ and $\mathrm{A\Gamma L}(n,\mathbb{F}_q)$. Since $\gamma$ is not unique, these subgroups are only defined
up to conjugacy in $S_{q^n}$.

We next define what it means for a permutation group to be regular. Given a group
G and $g \in G$, define $\varphi_g : G \to G$ by $\varphi_g(h) = gh$. One easily shows that $\varphi_g \in S(G)$.
Since $\varphi_g \circ \varphi_h = \varphi_{gh}$, the mapping $g \mapsto \varphi_g$ gives an isomorphism
$$G \simeq \{\varphi_g \mid g \in G\} \subset S(G).$$
In general, if T is any set, then a subgroup $G \subset S(T)$ is regular if there is a one-to-one
onto map $\gamma : G \to T$ such that the isomorphism $\hat\gamma : S(G) \simeq S(T)$ takes $\{\varphi_g \mid g \in G\} \subset S(G)$
to $G \subset S(T)$. When T is finite, it follows that every regular subgroup of S(T)
has |T| elements.

Make sure you understand how this definition captures the idea that G C S(T) is 
regular when the action of G on T looks like the action of G on itself given by the 
group operation of G. Here are some examples. 


Example 14.3.14 Let G be a group with n elements. In Section 7.4 we used the
Cayley table of G to show that G is isomorphic to a subgroup of $S_n$. In Exercise 9
you will show that this subgroup is regular.
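
The Cayley construction is easy to carry out explicitly. The following sketch is an illustration with assumed choices: it takes $G = S_3$, numbers its six elements 0,...,5 in the order produced by `itertools.permutations`, and writes each left multiplication $h \mapsto gh$ as a permutation of these labels. Rather than working from the definition of regular, it checks the equivalent criterion of Exercise 10, namely that the resulting subgroup is transitive with trivial isotropy subgroups.

```python
from itertools import permutations

elements = list(permutations(range(3)))             # the group G = S_3
position = {g: k for k, g in enumerate(elements)}   # a bijection G -> {0,...,5}

def compose(s, t):
    """(s o t)(i) = s(t(i)) for permutations stored as tuples of images."""
    return tuple(s[i] for i in t)

# Left multiplication by g, written as a permutation of the labels 0,...,5.
cayley = [tuple(position[compose(g, h)] for h in elements) for g in elements]

n = len(elements)
transitive = {perm[0] for perm in cayley} == set(range(n))
trivial_isotropy = all(sum(perm[k] == k for perm in cayley) == 1 for k in range(n))
print(len(set(cayley)), transitive, trivial_isotropy)
```

The six permutations are distinct, they move label 0 to every label, and only the identity fixes any given label.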


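Example 14.3.15 Consider $\mathrm{AGL}(n,\mathbb{F}_q) \subset S(\mathbb{F}_q^n)$. If $v \in \mathbb{F}_q^n$, then $\varphi_v$ is translation
by v, so that when we identify v with translation by v, we see that the translation
subgroup
$$\mathbb{F}_q^n \subset \mathrm{AGL}(n,\mathbb{F}_q)$$
is a regular subgroup of $S(\mathbb{F}_q^n)$. Furthermore, if we use $\gamma : \mathbb{F}_q^n \simeq \{1,\dots,q^n\}$ to regard
$\mathbb{F}_q^n \subset \mathrm{AGL}(n,\mathbb{F}_q)$ as subgroups of $S_{q^n}$, then $\mathbb{F}_q^n$ is regular in $S_{q^n}$.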


The following lemma will be useful in our proof of Galois’s structure theorem. 


Lemma 14.3.16 Suppose that $G \subset S_\ell$ is a subgroup. Then:
(a) If G is primitive and $N \ne \{e\}$ is normal in G, then N is transitive.
(b) If G is transitive and Abelian, then G is regular.

Proof: For part (a), consider the orbits of N acting on $\{1,\dots,\ell\}$. Fix an orbit $N \cdot j$,
$j \in \{1,\dots,\ell\}$, and take $\sigma \in G$. Since N is normal in G, we have
$$\sigma(N \cdot j) = \sigma N \sigma^{-1} \cdot \sigma(j) = N \cdot \sigma(j).$$
This shows that G preserves the block structure given by the orbits of N. Since G is
primitive, the block structure is trivial, so that either there is only one orbit or every
orbit has only one element. The latter is impossible (since $N \ne \{e\}$), and the former
implies that N is transitive. This proves part (a).

Turning to part (b), consider the isotropy subgroup $G_j$ of $j \in \{1,\dots,\ell\}$. We claim
that $G_j = \{e\}$. To prove this, let $\tau \in G$ and observe that
$$G_{\tau(j)} = \tau G_j \tau^{-1} = G_j,$$
where we use (A.19) and the fact that G is Abelian. Since G is transitive, we conclude
that the isotropy subgroups of G are equal. Thus $\tau \in G_j$ fixes not only j but also
every element of $\{1,\dots,\ell\}$. Hence $\tau = e$, so that $G_j = \{e\}$.

In Exercise 10 you will show that a subgroup of $S_\ell$ is regular if and only if it is
transitive with trivial isotropy subgroups. It follows that G is regular. ∎


We can now prove the following great theorem of Galois. 


Theorem 14.3.17 Let $G \subset S_\ell$ be a solvable primitive permutation group. Then
$\ell = p^n$, p prime, and (up to conjugacy)
$$\mathbb{F}_p^n \subset G \subset \mathrm{AGL}(n,\mathbb{F}_p) \subset S_{p^n}.$$

Proof: Let N be a minimal normal subgroup of G. Since G is primitive, part (a) of
Lemma 14.3.16 implies that N is transitive, and since G is solvable, Corollary 14.3.11
implies that $N \simeq \mathbb{F}_p^n$.

In particular, N is transitive and Abelian, so that N is regular by part (b) of
Lemma 14.3.16. It follows immediately that $\ell = |N| = p^n$, as claimed in the theorem.
Furthermore, since $N \simeq \mathbb{F}_p^n$, being regular means that $N \subset S_{p^n}$ is the image of
$$\mathbb{F}_p^n \subset S(\mathbb{F}_p^n)$$
under the isomorphism $\hat\gamma : S(\mathbb{F}_p^n) \simeq S_{p^n}$ coming from some one-to-one onto map
$\gamma : \mathbb{F}_p^n \to \{1,\dots,p^n\}$. Hence, to study $N \subset G \subset S_{p^n}$, we will consider
$$\mathbb{F}_p^n \subset G' \subset S(\mathbb{F}_p^n),$$
where $G'$ maps to G under $\hat\gamma$. An element $g \in G' \subset S(\mathbb{F}_p^n)$ gives a bijection
$$v \mapsto g \cdot v, \quad v \in \mathbb{F}_p^n. \tag{14.17}$$
We will show that (14.17) is affine linear, which will imply that $G' \subset \mathrm{AGL}(n,\mathbb{F}_p)$.

To describe how $G'$ acts on $\mathbb{F}_p^n$, let $G_0 \subset G'$ be the isotropy subgroup of $0 \in \mathbb{F}_p^n$.
Now consider (14.17) for $g \in G_0$ and write translation by v as $\gamma_{I_n,v}$. Since $\mathbb{F}_p^n \subset G'$ is
normal, we have $g\gamma_{I_n,v}g^{-1} = \gamma_{I_n,w}$ for some $w \in \mathbb{F}_p^n$. Using $g \cdot 0 = 0$, we compute the
action of g on v as follows:
$$\begin{aligned} g \cdot v &= g \cdot (\gamma_{I_n,v} \cdot 0) = (g\gamma_{I_n,v}) \cdot 0\\ &= (g\gamma_{I_n,v}g^{-1}) \cdot (g \cdot 0) = \gamma_{I_n,w} \cdot 0 = w. \end{aligned}$$
This shows that the map $v \mapsto g \cdot v$ corresponds to conjugation by g on the normal
subgroup $\mathbb{F}_p^n \subset G'$. Since conjugation is a group homomorphism, $v \mapsto g \cdot v$ must
also be a group homomorphism. Such a map is automatically linear over $\mathbb{F}_p$ by
Exercise 11. Thus (14.17) gives an element of $\mathrm{GL}(n,\mathbb{F}_p)$ when the latter is regarded
as consisting of permutations of $\mathbb{F}_p^n$. In other words, any element of $G_0$ is of the form
$\gamma_{A,0}$ for some $A \in \mathrm{GL}(n,\mathbb{F}_p)$.

We now prove that $G' \subset \mathrm{AGL}(n,\mathbb{F}_p)$. As above, translation by $v \in \mathbb{F}_p^n$ is $\gamma_{I_n,v}$.
Given $g \in G'$, let $v = g \cdot 0$. Then $\gamma_{I_n,-v}\,g \in G'$ maps 0 to 0 and hence lies in $G_0$. Thus
$\gamma_{I_n,-v}\,g = \gamma_{A,0}$, which implies that $g = \gamma_{I_n,v}\gamma_{A,0} = \gamma_{A,v} \in \mathrm{AGL}(n,\mathbb{F}_p)$. This shows that
$G' \subset \mathrm{AGL}(n,\mathbb{F}_p)$ and completes the proof of the theorem. ∎
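
A small sanity check of the theorem: $S_4$ is solvable and primitive, and $4 = 2^2$, so the theorem predicts that after identifying the four points with the vectors of $\mathbb{F}_2^2$, every element of $S_4$ acts by an affine linear map. The Python sketch below tests this directly, following the idea of the proof: for each permutation g it subtracts $g \cdot 0$ and checks that what remains is additive (which suffices for linearity over a prime field, as in Exercise 11). The identification of points with vectors and the name `is_affine` are assumptions made for the example.

```python
from itertools import permutations, product

vectors = list(product(range(2), repeat=2))     # the four points of F_2^2
zero = (0, 0)

def add(u, v):
    return tuple((x + y) % 2 for x, y in zip(u, v))

def is_affine(g):
    """g is a dict on F_2^2; test whether u -> g(u) - g(0) is additive."""
    linear_part = {u: add(g[u], g[zero]) for u in vectors}   # minus equals plus mod 2
    return all(linear_part[add(u, w)] == add(linear_part[u], linear_part[w])
               for u in vectors for w in vectors)

S4 = [dict(zip(vectors, images)) for images in permutations(vectors)]
print(len(S4), all(is_affine(g) for g in S4))
```

All 24 permutations pass, which matches the fact that the affine linear group of $\mathbb{F}_2^2$ has order 4 · 3 · 2 = 24.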


When applied to polynomials, Theorems 8.5.3 and 14.3.17 imply the following 
structure theorem for the Galois group of a primitive solvable polynomial. 


Corollary 14.3.18 Let F be a field of characteristic 0, and let $f \in F[x]$ be primitive.
If f is solvable by radicals over F, then f has degree $p^n$ for some prime p and integer
$n \ge 1$, and the Galois group of f over F is isomorphic to a subgroup of $\mathrm{AGL}(n,\mathbb{F}_p)$
containing the translation subgroup $\mathbb{F}_p^n$. ∎


Theorem 14.3.17 shows that a solvable primitive permutation group G satisfies
$$\mathbb{F}_p^n \subset G \subset \mathrm{AGL}(n,\mathbb{F}_p) \subset S_{p^n}.$$

Furthermore, the final part of the proof shows that the isotropy subgroup $G_0$ of $0 \in \mathbb{F}_p^n$
can be regarded as a subgroup of $\mathrm{GL}(n,\mathbb{F}_p)$ such that
$$G = \{\gamma_{A,v} \mid v \in \mathbb{F}_p^n,\ A \in G_0\}.$$

Thus G is uniquely determined by $G_0$. So it makes sense to ask if there is anything
special that we can say about $G_0$. As we will see, the answer involves the following
definition.


Definition 14.3.19 $G_0 \subset \mathrm{GL}(n,\mathbb{F}_p)$ is irreducible if there is no nontrivial subspace
$V \subset \mathbb{F}_p^n$ (i.e., no subspace $V \ne \{0\}$ and $\ne \mathbb{F}_p^n$) such that $g(V) \subset V$ for all $g \in G_0$.


Using this, we get the following useful result.

Proposition 14.3.20 Assume that G is a permutation group satisfying
$$\mathbb{F}_p^n \subset G \subset \mathrm{AGL}(n,\mathbb{F}_p) \subset S_{p^n},$$
and let $G_0 \subset \mathrm{GL}(n,\mathbb{F}_p)$ be the isotropy subgroup of 0. Then:
(a) G is primitive if and only if $G_0$ is irreducible.
(b) G is solvable if and only if $G_0$ is solvable.


Proof: For part (a), we will prove that G is imprimitive if and only if $G_0$ is reducible.
First assume that G is imprimitive with blocks $R_1,\dots,R_k$. Since $\mathbb{F}_p^n \subset G$ and $\mathbb{F}_p^n$ acts
transitively on itself, we know that G is transitive. By Lemma 14.2.7, it follows that
$1 < |R_1| = \cdots = |R_k| < p^n$.

Suppose for simplicity that $0 \in R_1$. We claim that $R_1$ is a subspace of $\mathbb{F}_p^n$. To
prove this, take $v \in R_1$ and observe that $v \cdot 0 = v$, since $\mathbb{F}_p^n$ acts by translation. Since
G preserves the blocks, we must have $v \cdot R_1 = R_1$, which means $v + w \in R_1$ for all
$w \in R_1$. Since $v \in R_1$ was arbitrary, $R_1$ is closed under addition and hence is a
subgroup, because $R_1$ is finite. Exercise 11 then implies that $R_1$ is a subspace.

However, every $g \in G_0$ maps 0 to 0 and hence $R_1$ to $R_1$, since G preserves the
blocks. This shows that $R_1$ is a nontrivial subspace of $\mathbb{F}_p^n$ such that $g(R_1) = R_1$ for
all $g \in G_0$. Hence $G_0$ is reducible.

Conversely, suppose that there is a nontrivial subspace V such that $g(V) \subset V$ for
all $g \in G_0$. Then $1 < |V| < p^n$, and $g(V) = V$ for all g. Now let $R_1,\dots,R_k$ be the
cosets of V in $\mathbb{F}_p^n$. In Exercise 12 you will show that G is imprimitive with respect to
the blocks $R_1,\dots,R_k$. This completes the proof of part (a).

The proof of part (b) is a straightforward application of the results of Section 8.1.
See Exercise 12 for the details. ∎
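
Part (a) can be explored experimentally for p = 3 and n = 2. The sketch below is a rough illustration, not the method of the text: all function names are hypothetical, the two choices of $G_0$ are examples picked for this purpose, and primitivity is tested by computing, for each nonzero b, the finest G-invariant partition in which 0 and b lie in the same class (a union-find closure under the generators). Since G is transitive, G is primitive exactly when this class is always the whole space.

```python
from itertools import product

P = 3
VECTORS = list(product(range(P), repeat=2))
ZERO = (0, 0)

def mat_vec(A, u):
    return tuple(sum(a * x for a, x in zip(row, u)) % P for row in A)

def group_generators(linear_gens):
    """Generators of G: the matrices generating G_0 plus translations by basis vectors."""
    gens = [lambda u, A=A: mat_vec(A, u) for A in linear_gens]
    for e in [(1, 0), (0, 1)]:
        gens.append(lambda u, e=e: tuple((x + y) % P for x, y in zip(u, e)))
    return gens

def smallest_block(gens, a, b):
    """Class of a in the finest G-invariant partition that puts a and b together."""
    parent = {u: u for u in VECTORS}

    def find(u):
        while parent[u] != u:
            u = parent[u]
        return u

    parent[find(a)] = find(b)
    changed = True
    while changed:
        changed = False
        for g in gens:
            for u in VECTORS:
                x, y = find(g(u)), find(g(find(u)))
                if x != y:      # u and find(u) share a class, so their images must too
                    parent[x] = y
                    changed = True
    return {u for u in VECTORS if find(u) == find(a)}

def is_primitive(linear_gens):
    gens = group_generators(linear_gens)
    # G is transitive, so it suffices to test blocks containing 0.
    return all(len(smallest_block(gens, ZERO, b)) == len(VECTORS)
               for b in VECTORS if b != ZERO)

diagonal = [((2, 0), (0, 1)), ((1, 0), (0, 2))]   # preserves the line spanned by (1, 0)
no_eigenvector = [((0, 1), (1, 1))]               # x^2 - x - 1 has no root in F_3
print(is_primitive(diagonal), is_primitive(no_eigenvector))
```

The diagonal group is reducible and gives an imprimitive G, while the cyclic group generated by a matrix with no eigenvector in $\mathbb{F}_3$ is irreducible and gives a primitive G.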


Theorem 14.3.17 and Proposition 14.3.20 imply that classifying solvable primitive
subgroups of $S_{p^n}$ reduces to the study of solvable irreducible subgroups of $\mathrm{GL}(n,\mathbb{F}_p)$.
We will use the n = 2 case of this strategy in Section 14.4 when we consider solvable
primitive subgroups of $S_{p^2}$.


Mathematical Notes 
Some important ideas from group theory appear in this section. 


■ Multiply Transitive Groups. Besides transitive and doubly transitive groups, one
can define k-transitive subgroups of $S_n$ for $1 \le k \le n$ as follows. A subgroup $G \subset S_n$
acts on the set $P_k$ of ordered k-tuples of distinct elements of $\{1,\dots,n\}$ by
$$\sigma \cdot (i_1,\dots,i_k) = (\sigma(i_1),\dots,\sigma(i_k)), \quad \sigma \in G,\ (i_1,\dots,i_k) \in P_k.$$

Then G is k-transitive if G acts transitively on $P_k$. In Exercise 13 you will show that
$S_n$ is n-transitive and $A_n$ is (n - 2)-transitive, and in Proposition 14.3.5 we showed
that $\mathrm{AGL}(n,\mathbb{F}_q)$ is 2-transitive (i.e., doubly transitive).
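
The definition of k-transitivity lends itself to a direct computer check for small n. The sketch below uses hypothetical helper names and labels the points 0,...,n-1. It assumes that its input is a complete group, in which case k-transitivity is equivalent to the orbit of the single k-tuple (0,...,k-1) having as many elements as $P_k$, namely n(n - 1)···(n - k + 1).

```python
from itertools import permutations
from math import perm

def is_even(p):
    """Parity of a permutation via its number of inversions."""
    inversions = sum(p[i] > p[j] for i in range(len(p)) for j in range(i + 1, len(p)))
    return inversions % 2 == 0

def is_k_transitive(group, n, k):
    """For a complete group, test whether the orbit of (0,...,k-1) is all of P_k."""
    orbit = {g[:k] for g in group}
    return len(orbit) == perm(n, k)

n = 5
S5 = list(permutations(range(n)))
A5 = [g for g in S5 if is_even(g)]
print([k for k in range(1, n + 1) if is_k_transitive(S5, n, k)])
print([k for k in range(1, n + 1) if is_k_transitive(A5, n, k)])
```

For n = 5 the symmetric group is k-transitive for every k up to 5, while the alternating group is k-transitive exactly for k up to 3 = n - 2.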

An example of a 4-transitive group is the Mathieu group
$$M_{11} = \langle (2\ 10)(4\ 11)(5\ 7)(8\ 9),\ (1\ 4\ 3\ 8)(2\ 5\ 6\ 9) \rangle \subset S_{11}.$$

This is a simple group of order 7920 and is the smallest sporadic group in the
classification of finite simple groups. Some of the many interesting aspects of
multiply transitive groups are discussed in [3, Ch. 7] and [7, Secs. 5.7, 5.8].

■ Finite Simple Groups. The group $\mathrm{GL}(n,\mathbb{F}_q)$ is finite whenever $\mathbb{F}_q$ is a finite
field. This leads to an interesting finite simple group as follows. First observe that
$\mathrm{GL}(n,\mathbb{F}_q)$ contains the normal subgroups
$$\begin{aligned} \mathrm{SL}(n,\mathbb{F}_q) &= \{A \in \mathrm{GL}(n,\mathbb{F}_q) \mid \det(A) = 1\},\\ \mathbb{F}_q^* I_n &= \{\lambda I_n \mid \lambda \in \mathbb{F}_q^*\}, \end{aligned}$$
where $I_n$ is the $n \times n$ identity matrix. The group $\mathrm{SL}(n,\mathbb{F}_q)$ is normal because it is the
kernel of the homomorphism $\det : \mathrm{GL}(n,\mathbb{F}_q) \to \mathbb{F}_q^*$, and $\mathbb{F}_q^* I_n$ is normal because its
elements commute with all $n \times n$ matrices.

The projective linear group is the quotient group
$$\mathrm{PGL}(n,\mathbb{F}_q) = \mathrm{GL}(n,\mathbb{F}_q)/\mathbb{F}_q^* I_n,$$
which is also finite. Furthermore, inside this group we have the subgroup
$$\mathrm{PSL}(n,\mathbb{F}_q) \subset \mathrm{PGL}(n,\mathbb{F}_q)$$
consisting of all elements of $\mathrm{PGL}(n,\mathbb{F}_q)$ represented by an element of $\mathrm{SL}(n,\mathbb{F}_q)$. The
remarkable fact is that $\mathrm{PSL}(n,\mathbb{F}_q)$ is almost always simple.


Theorem 14.3.21 Let $\mathbb{F}_q$ be a finite field and $n > 1$ be an integer. Then $\mathrm{PSL}(n,\mathbb{F}_q)$
is a simple group except when n = 2 and q = 2 or 3. ∎


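A proof can be found in [2] or [8]. In Exercises 14 and 15 you will show that
$\mathrm{PSL}(2,\mathbb{F}_2) \simeq S_3$ and $\mathrm{PSL}(2,\mathbb{F}_3) \simeq A_4$, which are not simple. You will also show that
$$|\mathrm{PSL}(2,\mathbb{F}_4)| = |\mathrm{PSL}(2,\mathbb{F}_5)| = 60 \quad\text{and}\quad |\mathrm{PSL}(2,\mathbb{F}_7)| = |\mathrm{PSL}(3,\mathbb{F}_2)| = 168.$$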
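
These orders can be reproduced with a few lines of Python. The sketch below does not construct the groups. It assumes two standard counting facts that are not derived in the text: the order of the general linear group is the number of ordered bases of $\mathbb{F}_q^n$, and passing to the projective special linear group divides this order by $(q - 1)\gcd(n, q - 1)$. The function names are hypothetical.

```python
from math import gcd, prod

def gl_order(n, q):
    """|GL(n, F_q)|, counted as the number of ordered bases of F_q^n."""
    return prod(q**n - q**i for i in range(n))

def psl_order(n, q):
    """|PSL(n, F_q)| = |GL(n, F_q)| / ((q - 1) * gcd(n, q - 1))."""
    return gl_order(n, q) // ((q - 1) * gcd(n, q - 1))

for n, q in [(2, 2), (2, 3), (2, 4), (2, 5), (2, 7), (3, 2)]:
    print((n, q), psl_order(n, q))
```

The output lists the orders 6, 12, 60, 60, 168, and 168, in agreement with the orders of $S_3$ and $A_4$ and with the displayed equalities.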


One can prove that $\mathrm{PSL}(2,\mathbb{F}_4) \simeq \mathrm{PSL}(2,\mathbb{F}_5) \simeq A_5$ and that every non-Abelian simple
group of order < 200 is isomorphic to either $A_5$ or $\mathrm{PSL}(2,\mathbb{F}_7) \simeq \mathrm{PSL}(3,\mathbb{F}_2)$ (see [8,
Satz 6.15] and [14, pp. 106-107]). In Example 13.3.10 and Exercise 9 of Section 13.3
we showed that the Galois group of $x^7 - 154x + 99$ over $\mathbb{Q}$ is
$$\mathrm{GL}(3,\mathbb{F}_2) \simeq \mathrm{PSL}(3,\mathbb{F}_2).$$

The paper [1] gives a nice description of the isomorphism $\mathrm{PSL}(2,\mathbb{F}_7) \simeq \mathrm{GL}(3,\mathbb{F}_2)$.

Finally, we should mention that other finite simple groups can be constructed using 
matrices over finite fields. These groups play an important role in the classification 
of finite simple groups. See [6] for an introduction. 


■ The O’Nan-Scott Theorem. Theorem 14.3.17 describes the structure of solvable
primitive permutation groups. This is a special case of the O’Nan-Scott Theorem,
which describes the structure of arbitrary primitive permutation groups. The O’Nan-Scott
Theorem is a basic tool in the study of primitive permutation groups. The full
statement of the theorem (see [3, Ch. 4]) is beyond the scope of this book.

However, it is possible to give a brief glimpse into what this theorem says. We 
need the following concept. 

Definition 14.3.22 The socle of a finite group G is the subgroup H generated by the
minimal normal subgroups of G.


In Exercise 16 you will show that the socle is a product of finite simple groups.
It is also obviously normal in G. For a primitive permutation group $G \subset S_\ell$, one
can prove the stronger result that the socle $H \subset G$ is a transitive subgroup such that
$H \simeq A^m$ for some finite simple group A.

The O’Nan-Scott Theorem classifies a primitive permutation group $G \subset S_\ell$ according
to its socle $H \simeq A^m$. There are two cases, each with several subcases:


Regular Socle. If H is regular, then G falls into one of two classes:

• Abelian Socle. $A = \mathbb{F}_p$ and $H = \mathbb{F}_p^m \subset G \subset \mathrm{AGL}(m,\mathbb{F}_p)$.
• Non-Abelian Socle. $H \simeq A^m$, where $m \ge 6$ and A is non-Abelian, and G is a
“twisted wreath product” with restricted isotropy subgroups (see [3, Sec. 4.7]).


Nonregular Socle. Here, H is non-Abelian and G falls into one of three classes:

• Almost Simple. $H = A$, where A is non-Abelian, and $A \subset G \subset \mathrm{Aut}(A)$, where
$\mathrm{Aut}(A)$ is the group of all automorphisms of A, and G/A is solvable.
• Diagonal. $H \simeq A^m$, where $m \ge 2$, and G is a “subgroup of a wreath product with
diagonal action” (see [3, Sec. 4.7]).
• Product. $H \simeq A^m$, where $m \ge 2$ and G is a “subgroup of a wreath product with
product action” (see [3, Sec. 4.7]).

One way to think of Theorem 14.3.17 is that it explains how solvable primitive
permutation groups relate to the larger class of all primitive permutation groups: They
fit into the “regular Abelian socle” class of the O’Nan-Scott Theorem.

The O’Nan-Scott Theorem has many applications in group theory. For example,
we know from Proposition 14.3.3 that doubly transitive groups are primitive. One can show
that doubly transitive groups belong to the Abelian Socle or Almost Simple cases of
the O’Nan-Scott Theorem. This and the classification of finite simple groups lead
to a classification of all doubly transitive permutation groups. See [3, Sec. 7.7] for a
discussion of this result.


Historical Notes 


Why did Galois invent finite fields? After all, the main focus of his research was 
on the roots of polynomials. This question is now easy to answer using Galois’s 
own words. Before giving the quotation, we recall from the Historical Notes to 
Section 11.1 that Galois considered elements of finite fields as “imaginary solutions” 
of congruences. In this language, here is what Galois had to say about the importance 
of finite fields [Galois, p. 125]: 


It is mainly in the theory of permutations ... that the consideration of
imaginary roots of congruences appears to be indispensable. This gives a simple
and easy method to recognize in which case a primitive equation is solvable by
radicals, as I will now try to give the idea in a few words.

Given an algebraic equation $fx = 0$ of degree $p^\nu$, suppose that the $p^\nu$
roots are denoted by $x_k$, where the index k has the $p^\nu$ values determined by the
congruence $k^{p^\nu} = k$ (mod. p).

Take any arbitrary rational function V of the $p^\nu$ roots $x_k$. One transforms
this function by substituting everywhere the index k with the index $(ak + b)^{p^r}$,
a, b, r being arbitrary constants satisfying $a^{p^\nu - 1} = 1$, $b^{p^\nu} = b$ (mod. p) and r
an integer.

This is taken from Galois’s article on finite fields. In the second paragraph of the
quotation, Galois explains how elements of $S_{p^\nu}$ can be regarded as permutations of
the finite field $\mathbb{F}_{p^\nu}$. The function V in the third paragraph is an element of the splitting
field of f, and the substitutions described by Galois form the affine semilinear group
$\mathrm{A\Gamma L}(1,\mathbb{F}_{p^\nu})$. The formula $(ak + b)^{p^r}$ differs from the definition of semilinear given
in the text, but later in the article Galois explains that when using this group,


... the value substituted for k in every index can be put in the three forms
$$(ak + b)^{p^r} = (a\{k + b'\})^{p^r} = a'k^{p^r} + b'' = a'(k + b')^{p^r}.$$


(See [Galois, p. 125].) The formula $a'k^{p^r} + b''$ is the one we used to define
$\mathrm{A\Gamma L}(1,\mathbb{F}_{p^\nu})$. This group uses both the field $\mathbb{F}_{p^\nu}$ and the Galois group $\mathrm{Gal}(\mathbb{F}_{p^\nu}/\mathbb{F}_p)$.

These quotes show that Galois’s reason for introducing $\mathrm{A\Gamma L}(1,\mathbb{F}_{p^\nu})$ is that he
wants to “recognize in which case a primitive equation is solvable by radicals.”
Galois knew that $\mathrm{A\Gamma L}(1,\mathbb{F}_{p^\nu})$ is solvable and plays an important role in determining
when a primitive polynomial is solvable by radicals. We will have more to say about
this in the next section.

Theorem 14.3.17 is the major result of this section and is due to Galois, though 
he stated only the polynomial version given in Corollary 14.3.18. In his letter to 
Chevalier written the night before his fatal duel, Galois describes his theorem as 
follows [Galois, p. 177]: 


1° In order that a primitive equation be solvable by radicals, it must be of
degree $p^\nu$, p being prime.
2° All of the permutations of such an equation are of the form
$$x_{k,l,m,\dots} \,/\, x_{ak+bl+cm+\dots+f,\ a_1k+b_1l+c_1m+\dots+g,\ \dots},$$
k, l, m, ... being $\nu$ indices that take the p values indicating all of the roots. The
indices are taken modulo p, that is to say, the roots are the same when one adds
a multiple of p to one of the indices.

The group obtained by using all substitutions of this linear form contains
all together $p^n(p^n - 1)(p^n - p)\cdots(p^n - p^{n-1})$ permutations.


Notice how item 2° describes $\mathrm{AGL}(\nu,\mathbb{F}_p)$. Also observe that the final sentence
replaces $\nu$ with n. In Exercise 17 you will prove Galois’s assertion that
$$|\mathrm{AGL}(n,\mathbb{F}_p)| = p^n(p^n - 1)(p^n - p)\cdots(p^n - p^{n-1}). \tag{14.18}$$
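
Formula (14.18) can be compared with a brute-force count in small cases. The sketch below uses hypothetical function names and assumes only that each affine linear map is given by a unique pair consisting of an invertible matrix and a translation vector, so that the order of the affine linear group is $p^n$ times the number of invertible matrices. Invertibility is tested by checking that the matrix permutes the vectors of $\mathbb{F}_p^n$.

```python
from itertools import product
from math import prod

def gl_order_brute_force(n, p):
    """Count n x n matrices over F_p (p prime) that permute the vectors of F_p^n."""
    vectors = list(product(range(p), repeat=n))
    count = 0
    for entries in product(range(p), repeat=n * n):
        rows = [entries[r * n:(r + 1) * n] for r in range(n)]
        images = {tuple(sum(a * x for a, x in zip(row, u)) % p for row in rows)
                  for u in vectors}
        count += len(images) == len(vectors)
    return count

def agl_order_formula(n, p):
    """Right-hand side of (14.18)."""
    return p**n * prod(p**n - p**i for i in range(n))

for n, p in [(1, 5), (2, 3), (3, 2)]:
    # p**n translations times the number of invertible matrices
    print((n, p), p**n * gl_order_brute_force(n, p), agl_order_formula(n, p))
```

In each of the three cases the two numbers agree: 20, 432, and 1344.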


We should also mention the observation of [12, p. 133] that Abel knew Galois’s 
assertion 1° about the degree of a primitive polynomial solvable by radicals. Here is 
how Abel stated the result [Abel, Vol. II, p. 222]: 


If an irreducible equation of degree $\mu$, divisible by prime numbers distinct from 
each other, is solvable algebraically, then one can always decompose $\mu$ into 
two factors $\mu_1$ and $\mu_2$, such that the given equation is decomposable into $\mu_1$ 
equations, each of degree $\mu_2$, and whose coefficients depend on equations of 
degree $\mu_1$. 


When we compare this with Galois’s definition of primitive given in the Historical 
Notes to Section 14.2, we see that Abel is saying that if an irreducible polynomial f is 
solvable by radicals, then f is imprimitive whenever its degree is not a prime power. 
The above passage appears in an unfinished manuscript that Abel wrote shortly before 
his death. It shows how Abel was also struggling to understand what it means for a 
polynomial to be solvable by radicals. 

Finally, the simple groups coming from finite fields were first studied by Jordan. 
In 1870, Jordan gave an incomplete proof that $PSL(n,\mathbb{F}_p)$ is simple except for n = 2 
and p = 2 or 3. In his proof, Jordan used what we now call Jordan canonical form 
to study matrices in $GL(n,\mathbb{F}_p)$. This canonical form uses the eigenvalues of the 
matrix, which are roots of the characteristic polynomial. Hence the eigenvalues lie 
in finite extensions of $\mathbb{F}_p$. This shows that more general finite fields arise naturally 
when analyzing $GL(n,\mathbb{F}_p)$. Jordan went on to consider $GL(n,\mathbb{F}_{p^m})$, though the first 
complete proof of Theorem 14.3.21 is due to Dickson in 1897. 


Exercises for Section 14.3 


Exercise 1. The goal of this exercise is to prove that primitive permutation groups are transitive. 
Assume that $G \subset S_n$ is primitive but not transitive, and derive a contradiction as follows. 
(a) Explain why $n > 1$. 
(b) Let the orbits of G acting on $\{1,\ldots,n\}$ be $R_1,\ldots,R_k$ (see Section A.4 if you have forgotten 
about orbits). Explain why $k > 1$ and why elements of G map every orbit to itself. 
(c) Conclude that G is imprimitive. Be sure to take into account the case when every orbit 
consists of a single element. 


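Exercise 2. Let $\gamma_{I_n,v} \in AGL(n,\mathbb{F}_q)$ be translation by $v \in \mathbb{F}_q^n$, and let $\gamma_{A,w} \in AGL(n,\mathbb{F}_q)$ be 
arbitrary. 
(a) Prove that $\gamma_{A,w}^{-1} = \gamma_{A^{-1},-A^{-1}w}$. 
(b) Prove that $\gamma_{A,w} \circ \gamma_{I_n,v} \circ \gamma_{A,w}^{-1} = \gamma_{I_n,Av}$. 
(c) Part (b) shows that the translation subgroup $\mathbb{F}_q^n \subset AGL(n,\mathbb{F}_q)$ is normal. Prove that the 
quotient group $AGL(n,\mathbb{F}_q)/\mathbb{F}_q^n$ is isomorphic to $GL(n,\mathbb{F}_q)$. 
(d) Prove that $AGL(n,\mathbb{F}_q)$ is isomorphic to the semidirect product $\mathbb{F}_q^n \rtimes GL(n,\mathbb{F}_q)$, where 
$GL(n,\mathbb{F}_q)$ acts on $\mathbb{F}_q^n$ by matrix multiplication. 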


Exercise 3. Consider the affine semilinear group $A\Gamma L(n,\mathbb{F}_q)$ for $q = p^m$. 
(a) Prove that $AGL(n,\mathbb{F}_q)$ is a normal subgroup of $A\Gamma L(n,\mathbb{F}_q)$ of index m. 
(b) Prove that $\mathbb{F}_q^n$ is a normal subgroup of $A\Gamma L(n,\mathbb{F}_q)$. 
(c) Prove that elements of $A\Gamma L(n,\mathbb{F}_q)$ give maps $\mathbb{F}_q^n \to \mathbb{F}_q^n$ that are affine linear over $\mathbb{F}_p$. 


Exercise 4. Let F be any field. The definition of $AGL(n,\mathbb{F}_q)$ given in the text extends to 
$AGL(n,F)$. The goal of this exercise is to prove that $AGL(n,F)$ is doubly transitive when we 
regard elements of $AGL(n,F)$ as permutations of the vector space $F^n$. 
(a) Use $F^n \subset AGL(n,F)$ to show that $AGL(n,F)$ acts transitively on $F^n$. 
(b) Inside $AGL(n,F)$, we have the isotropy subgroup of $0 \in F^n$. Prove that this isotropy 
subgroup is $GL(n,F)$. 
(c) Prove that $GL(n,F)$ acts transitively on $F^n \setminus \{0\}$. 
(d) Use Exercise 19 below to conclude that $AGL(n,F)$ is doubly transitive. 


Exercise 5. Let A and B be non-Abelian simple groups. You will show that $A \times \{e_B\}$ and 
$\{e_A\} \times B$ are the only nontrivial normal subgroups of $A \times B$. Let $N \subset A \times B$ be a normal 
subgroup different from $\{(e_A,e_B)\}$, $A \times \{e_B\}$, and $\{e_A\} \times B$. 
(a) Prove that $A \times \{e_B\}$ and $\{e_A\} \times B$ are normal in $A \times B$. Hence, if we can show that 
$N = A \times B$, then we will be done. 
(b) Prove that we can find $(a,b) \in N$ such that $e_A \ne a \in A$ and $e_B \ne b \in B$. 
(c) Let $(a,b) \in N$ be as in part (b). Show that $(aa_1a^{-1}a_1^{-1}, e_B) \in N$ for any $a_1 \in A$. 
(d) Given $e_A \ne a \in A$, prove that there is $a_1 \in A$ such that $aa_1 \ne a_1a$. Then combine this with 
parts (b) and (c) to show that $N \cap (A \times \{e_B\}) = A \times \{e_B\}$. 
(e) Part (d) implies that $A \times \{e_B\} \subset N$, and the inclusion $\{e_A\} \times B \subset N$ is proved similarly. 
Use this to prove that $N = A \times B$. 

Exercise 18 will explore various aspects of this argument. 


Exercise 6. Let $A \subset N$ be a minimal normal subgroup, where N is normal in a larger group G. 
Given $g \in G$, we set $A_g = gAg^{-1}$. 
(a) Prove that $A_g$ is isomorphic to A and is a minimal normal subgroup of N. 
(b) Fix $g_1 \in G$ and consider $AA_{g_1}$. By Exercise 7, we know that $AA_{g_1}$ is a subgroup of N. 
Assume that $A_g \subset AA_{g_1}$ for all $g \in G$. Prove that $AA_{g_1}$ is normal in G. 
(c) Use the following idea to complete the proof of Proposition 14.3.10. Let $\mathcal{S}$ be the set 
of all subgroups of N of the form $A_{g_1} \cdots A_{g_n}$ such that the map $(a_1,\ldots,a_n) \mapsto a_1 \cdots a_n$ 
defines an isomorphism 


$A_{g_1} \times \cdots \times A_{g_n} \simeq A_{g_1} \cdots A_{g_n}$. 


Note that $A = A_e \in \mathcal{S}$. Then pick an element of $\mathcal{S}$ of maximal order. 


Exercise 7. Let H and K be normal subgroups of a group G. Let $HK = \{hk \mid h \in H, k \in K\}$. 
(a) Prove that HK is a normal subgroup of G. 
(b) Assume that $H \cap K = \{e\}$. Prove that $hk = kh$ for all $h \in H$, $k \in K$. 
(c) As in part (b), assume that $H \cap K = \{e\}$. Prove that the map $H \times K \to HK$ defined by 
$(h,k) \mapsto hk$ is a group isomorphism. 


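Exercise 8. Suppose that $\gamma, \gamma' : T \to \{1,\ldots,\ell\}$ are one-to-one and onto. As explained in the 
text, these give isomorphisms $\hat\gamma, \hat\gamma' : S(T) \simeq S_\ell$. 
(a) Explain why $\sigma = \gamma \circ (\gamma')^{-1}$ is an element of $S_\ell$. 
(b) Let $\sigma \in S_\ell$ be as in part (a), and let $\hat\sigma : S_\ell \to S_\ell$ be conjugation by $\sigma$. Thus $\hat\sigma(\tau) = \sigma\tau\sigma^{-1}$ 
for $\tau \in S_\ell$. Prove that $\hat\gamma = \hat\sigma \circ \hat\gamma'$. 
This proves that $\hat\gamma$ and $\hat\gamma'$ differ by conjugation by an element of $S_\ell$. 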


Exercise 9. Let G be a group of order n. In Section 7.4 we constructed a subgroup $H \subset S_n$ 
isomorphic to G. Prove that H is regular in $S_n$. 


Exercise 10. A permutation group $G \subset S_\ell$ is regular if there is a one-to-one onto map 
$\gamma : G \to \{1,\ldots,\ell\}$ such that $\hat\gamma : S(G) \simeq S_\ell$ maps $\{\varphi_g \mid g \in G\} \subset S(G)$ to $G \subset S_\ell$. Recall that 
$\varphi_g \in S(G)$ is defined by $\varphi_g(h) = gh$ for $h \in G$. The goal of this exercise is to show that G is 
regular if and only if it is transitive with trivial isotropy subgroups. 
(a) Let $G \subset S_\ell$ be regular. Prove that G is transitive and that the isotropy subgroups of G are 
trivial. 
(b) For the rest of the exercise, assume that G is transitive with trivial isotropy subgroups. 
Define $\gamma : G \to \{1,\ldots,\ell\}$ by $\gamma(\tau) = \tau(1)$ for $\tau \in G$. Prove that this map is one-to-one 
and onto. 
(c) The map $\gamma$ of part (b) gives $\hat\gamma : S(G) \simeq S_\ell$. Show that $\hat\gamma(\varphi_g) = g$, and conclude that G is 
regular. 


Exercise 11. We can regard F; as both a group (under addition) and a vector space over F, 
(under addition and scalar multiplication). However, since we are over F,, scalar multiplication 
can be built out of addition. Use this observation to prove the following: 

(a) Any subgroup of F; is a subspace. 

(b) Any group homomorphism ¥ : F —> F? is linear. 


Exercise 12. This exercise will use the notation of the proof of Proposition 14.3.20. 
(a) Suppose that $V \subset \mathbb{F}_p^n$ is a nontrivial subspace such that $g(V) \subset V$ for all $g \in G_0$. Use the 
cosets of V in $\mathbb{F}_p^n$ to prove that G is imprimitive. 
(b) Explain why $\mathbb{F}_p^n$ is normal in G, and prove that $G/\mathbb{F}_p^n \simeq G_0$. Use this to prove part (b) of 
Proposition 14.3.20. 


Exercise 13. Consider the definition of k-transitive given in the Mathematical Notes. 
(a) Prove that $S_n$ is n-transitive. 
(b) Prove that $A_n$ is $(n-2)$-transitive when n > 3. 


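Exercise 14. Consider the groups $GL(2,\mathbb{F}_q)$, $SL(2,\mathbb{F}_q)$, $PGL(2,\mathbb{F}_q)$, and $PSL(2,\mathbb{F}_q)$ defined 
in the Mathematical Notes. 
(a) Prove that $|GL(2,\mathbb{F}_q)| = q(q-1)(q^2-1)$. 
(b) Prove that $|SL(2,\mathbb{F}_q)| = |PGL(2,\mathbb{F}_q)| = q(q^2-1)$. 
(c) Prove that $PSL(2,\mathbb{F}_q) = SL(2,\mathbb{F}_q)/\{\pm I_2\}$, and conclude that 


$|PSL(2,\mathbb{F}_q)| = \begin{cases} \frac{1}{2}q(q^2-1), & q \ne 2^m, \\ q(q^2-1), & q = 2^m. \end{cases}$ 


(d) Compute $|PSL(2,\mathbb{F}_q)|$ for q = 2, 3, 4, 5, 7. 
(e) Show that $|GL(3,\mathbb{F}_2)| = |PSL(3,\mathbb{F}_2)| = 168$. 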


Exercise 15. Prove that $GL(2,\mathbb{F}_2) = SL(2,\mathbb{F}_2) \simeq PSL(2,\mathbb{F}_2) \simeq S_3$ and $PSL(2,\mathbb{F}_3) \simeq A_4$. 


Exercise 16. Let G be a finite group with socle H. Prove that H is isomorphic to a product of 
finite simple groups. 


Exercise 17. Prove Galois's formula (14.18) for $|AGL(n,\mathbb{F}_p)|$. 


Exercise 18. Here are some observations related to Exercise 5. 
(a) Give an example to show that Exercise 5 is false if we drop the assumption that A and B 
are non-Abelian. 
(b) Let $A_1,\ldots,A_r$ be non-Abelian simple groups. Determine all nontrivial normal subgroups 
of $A_1 \times \cdots \times A_r$. 


Exercise 19. Let $G \subset S_n$ be transitive, and let $G_i$ be the isotropy subgroup of $i \in \{1,\ldots,n\}$. 
Thus $G_i = \{\sigma \in G \mid \sigma(i) = i\}$. 
(a) Prove that G is doubly transitive if and only if $G_i$ acts transitively on $\{1,\ldots,n\} \setminus \{i\}$. 
(b) More generally, let k > 2. Prove that G is k-transitive if and only if $G_i$ acts $(k-1)$-transitively 
on $\{1,\ldots,n\} \setminus \{i\}$. 


Exercise 20. Let $G \subset S_n$ be doubly transitive. Proposition 14.3.3 implies that G is transitive. 
Prove that G is transitive directly from the definition of doubly transitive. 


Exercise 21. Generalize (14.15) by showing that we have inclusions 
$\mathbb{F}_p^{nm} \simeq \mathbb{F}_{p^m}^n \subset AGL(n,\mathbb{F}_{p^m}) \subset A\Gamma L(n,\mathbb{F}_{p^m}) \subset AGL(nm,\mathbb{F}_p) \subset S_{p^{nm}}$. 


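Exercise 22. Show that $AGL(n,\mathbb{F}_q)$ is isomorphic to the subgroup 
$\left\{ \begin{pmatrix} A & v \\ 0 & 1 \end{pmatrix} \;\middle|\; A \in GL(n,\mathbb{F}_q),\ v \in \mathbb{F}_q^n \right\} \subset GL(n+1,\mathbb{F}_q)$, 


where $\begin{pmatrix} A & v \\ 0 & 1 \end{pmatrix}$ is the $(n+1) \times (n+1)$ matrix such that the upper left $n \times n$ corner is A, the first 
n entries of the last column are v, and the first n entries of the last row are all zero. 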


Exercise 23. Use Theorem 14.3.21 to show that $AGL(2,\mathbb{F}_p)$ is not solvable for p > 3. 


Exercise 24. The action of $PGL(2,F)$ on $\hat{F} = F \cup \{\infty\}$ was introduced in Section 7.5. In 
particular, Exercise 11 of that section implies that the isotropy subgroup of $PGL(2,F)$ at the 
point $\infty$ can be identified with $AGL(1,F)$. Use part (c) of Exercise 4 and Exercise 19 to prove 
that the action of $PGL(2,F)$ on $\hat{F}$ is 3-transitive (also called triply transitive). 


Exercise 25. Prove that $AGL(1,\mathbb{F}_4) \simeq A_4$ and $A\Gamma L(1,\mathbb{F}_4) \simeq S_4$. 


Exercise 26. Compute the orders of the groups in (14.15). 


14.4 PRIMITIVE POLYNOMIALS OF PRIME-SQUARED DEGREE 


Let $f \in F[x]$ be a primitive polynomial of degree $p^2$, where p is prime and F has 
characteristic 0. The main goal of this section is to understand which Galois groups 
can occur when f is solvable by radicals over F. 

By Chapter 8 and Section 14.2, this is equivalent to classifying the solvable 
primitive subgroups of $S_{p^2}$ up to conjugacy. The answer is more complicated than in 
the imprimitive case. Instead of the single subgroup $AGL(1,\mathbb{F}_p) \wr AGL(1,\mathbb{F}_p) \subset S_{p^2}$ 
used in Theorem 14.2.15, the primitive case will require three subgroups, denoted 
$M_1$, $M_2$, and $M_3$. 

Our strategy will be to first describe the $M_i$ and then show that every primitive 
solvable subgroup of $S_{p^2}$ is conjugate to a subgroup of one of them. The results 
of Section 14.3 imply that most of the proofs will take place in $AGL(2,\mathbb{F}_p)$ and 
$GL(2,\mathbb{F}_p)$. You will see a lot of $2 \times 2$ matrices in this section. 


A. The First Two Subgroups. The subgroups $M_1$ and $M_2$ are relatively easy to 
describe. The first subgroup is the affine semilinear group 


(14.19) $M_1 = A\Gamma L(1,\mathbb{F}_{p^2}) \subset S_{p^2}$ 

from (14.15). This subgroup has the following properties. 


Proposition 14.4.1 The subgroup $M_1 = A\Gamma L(1,\mathbb{F}_{p^2}) \subset S_{p^2}$ is solvable, doubly 
transitive, and primitive. Furthermore, $|M_1| = 2p^2(p^2-1)$. 


Proof: In Exercise 1 you will prove that $M_1$ is solvable and compute its order. Then 
we are done, since $M_1$ is doubly transitive and primitive by Proposition 14.3.5. ∎ 

The second subgroup is constructed as follows. A pair of affine linear 
transformations $\gamma, \gamma' \in AGL(1,\mathbb{F}_p)$ give 


$\delta = (\gamma,\gamma') : \mathbb{F}_p^2 \to \mathbb{F}_p^2$ 

defined by 


(14.20) $\delta(\alpha,\beta) = (\gamma(\alpha),\gamma'(\beta))$. 


In Exercise 2 you will show that $\delta$ is an affine linear transformation of $\mathbb{F}_p^2$. Thus we 
have an inclusion 


$AGL(1,\mathbb{F}_p) \times AGL(1,\mathbb{F}_p) \subset AGL(2,\mathbb{F}_p)$. 


The first $AGL(1,\mathbb{F}_p)$ acts on the first coordinate of a point in $\mathbb{F}_p^2$, and the second 
$AGL(1,\mathbb{F}_p)$ acts on the second coordinate. To get a more interesting group, we add 
the matrix $\begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}$ that switches the coordinates. This gives the group 


(14.21) $M_2 = \left\langle AGL(1,\mathbb{F}_p) \times AGL(1,\mathbb{F}_p), \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} \right\rangle \subset AGL(2,\mathbb{F}_p) \subset S_{p^2}$, 

where the last inclusion is from (14.15). This subgroup has the following properties. 


Proposition 14.4.2 The subgroup $M_2 \subset S_{p^2}$ described in (14.21) is solvable and, 
when p > 2, primitive. Furthermore, $|M_2| = 2p^2(p-1)^2$. 


Proof: In Exercise 2 you will verify that $\begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}$ has order 2 and satisfies 

$\begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} \bigl(AGL(1,\mathbb{F}_p) \times AGL(1,\mathbb{F}_p)\bigr) \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}^{-1} = AGL(1,\mathbb{F}_p) \times AGL(1,\mathbb{F}_p)$. 


It follows that $AGL(1,\mathbb{F}_p) \times AGL(1,\mathbb{F}_p) \subset M_2$ is a subgroup of index 2. From here, 
it is easy to compute $|M_2|$ and show that $M_2$ is solvable (see Exercise 2). 

It remains to prove that $M_2$ is primitive. First note that $M_2$ contains the translation 
subgroup $\mathbb{F}_p^2$, since $\mathbb{F}_p \subset AGL(1,\mathbb{F}_p)$. Hence, by Proposition 14.3.20, $M_2$ is primitive 
if and only if the isotropy subgroup $(M_2)_0 \subset GL(2,\mathbb{F}_p)$ is irreducible. In Exercise 2 
you will verify that $(M_2)_0$ is generated by the matrices 

(14.22) $\begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \quad \begin{pmatrix} \lambda & 0 \\ 0 & \mu \end{pmatrix}, \quad \lambda, \mu \in \mathbb{F}_p^*$. 

Let $\{0\} \ne V \subset \mathbb{F}_p^2$ be a subspace such that $\gamma(V) \subset V$ for all matrices $\gamma$ in (14.22). If 
we can show that $V = \mathbb{F}_p^2$, then $(M_2)_0$ will be irreducible and we will be done. 

Take $(a,b) \ne (0,0)$ in V. Using $\begin{pmatrix} \lambda & 0 \\ 0 & \mu \end{pmatrix}$ from (14.22), we see that $(\lambda a, \mu b) \in V$ for 
all $\lambda, \mu \in \mathbb{F}_p^*$. When a and b are both nonzero, this gives $(p-1)^2$ elements of V. 
Since p > 2 implies that $(p-1)^2 > p$, we conclude that $V = \mathbb{F}_p^2$ in this case. On the 
other hand, if a = 0, then $b \ne 0$, and using $\begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}$ from (14.22) shows that $(b,0) \in V$. 
Adding this to $(0,b) \in V$, we obtain $(b,b) \in V$ with both coordinates nonzero. Hence 
we are reduced to the previous case, so that $V = \mathbb{F}_p^2$. The case when b = 0 is handled 
similarly. ∎ 

Notice how the proof of primitivity uses $\begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} \in M_2$. In fact, it is easy to see that 
the smaller group $AGL(1,\mathbb{F}_p) \times AGL(1,\mathbb{F}_p)$ is imprimitive (see Exercise 2). 
It is also interesting to compare the subgroups $M_1$ and $M_2$. By Propositions 14.4.1 
and 14.4.2, we have 

$\dfrac{|M_1|}{|M_2|} = \dfrac{2p^2(p^2-1)}{2p^2(p-1)^2} = \dfrac{p+1}{p-1}$. 

Thus $|M_1| > |M_2|$. In Exercise 3 you will show that when p > 3, $M_2$ is not doubly 
transitive and is not isomorphic to a subgroup of $M_1$. So $M_1$ and $M_2$ are quite distinct 
as subgroups of $S_{p^2}$. 


B. The Third Subgroup. The third subgroup $M_3$ is harder to describe than the first 
two. We begin with a lemma about $2 \times 2$ matrices that will prove to be surprisingly 
useful. Recall that in any group G, the centralizer $C_G(g)$ of $g \in G$ is the subgroup 
consisting of all elements of G that commute with g. Also let $I_2 \in GL(2,\mathbb{F}_p)$ denote 
the identity matrix. 


Lemma 14.4.3 If $g \in GL(2,\mathbb{F}_p) \setminus \mathbb{F}_p^* I_2$, then 

$C_{GL(2,\mathbb{F}_p)}(g) = \{m \in GL(2,\mathbb{F}_p) \mid m = aI_2 + bg \text{ for some } a,b \in \mathbb{F}_p\}$. 


Proof: Every $aI_2 + bg \in GL(2,\mathbb{F}_p)$ obviously commutes with g. Now take m in 
$C_{GL(2,\mathbb{F}_p)}(g)$. Since $g \notin \mathbb{F}_p^* I_2$, you will prove in Exercise 4 that there is $v \in \mathbb{F}_p^2$ such 
that v and gv form a basis of $\mathbb{F}_p^2$. Hence there exist $a,b \in \mathbb{F}_p$ such that 


$mv = av + bgv = (aI_2 + bg)(v)$. 

Using mg = gm, we obtain 

$mgv = g(mv) = g(av + bgv) = agv + bg^2v = (aI_2 + bg)(gv)$. 


This implies that $m = aI_2 + bg$, since their corresponding linear maps agree on a basis 
of $\mathbb{F}_p^2$. ∎ 
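
Lemma 14.4.3 is easy to test by exhaustive search when p is small. The sketch below is only an illustration under our own naming (`mat_mul`, `gl2`, and `lemma_holds` are hypothetical helpers, not notation from the text): for every nonscalar g it compares the centralizer, computed directly from the definition, with the set of invertible matrices of the form $aI_2 + bg$.

```python
from itertools import product

def mat_mul(x, y, p):
    """Multiply two 2 x 2 matrices with entries reduced mod p."""
    (a, b), (c, d) = x
    (e, f), (g, h) = y
    return (((a*e + b*g) % p, (a*f + b*h) % p),
            ((c*e + d*g) % p, (c*f + d*h) % p))

def gl2(p):
    """List all invertible 2 x 2 matrices over F_p."""
    return [((a, b), (c, d)) for a, b, c, d in product(range(p), repeat=4)
            if (a*d - b*c) % p != 0]

def lemma_holds(p):
    """Check C(g) = {a I_2 + b g invertible} for every nonscalar g in GL(2, F_p)."""
    group = gl2(p)
    group_set = set(group)
    for g in group:
        (a, b), (c, d) = g
        if b == 0 and c == 0 and a == d:
            continue  # g is a scalar matrix, excluded by the lemma
        centralizer = {m for m in group
                       if mat_mul(m, g, p) == mat_mul(g, m, p)}
        combos = set()
        for x, y in product(range(p), repeat=2):
            m = (((x + y*a) % p, (y*b) % p), ((y*c) % p, (x + y*d) % p))
            if m in group_set:  # keep only invertible combinations
                combos.add(m)
        if centralizer != combos:
            return False
    return True

print(lemma_holds(3), lemma_holds(5))
```

The search is quadratic in $|GL(2,\mathbb{F}_p)|$, so it is practical only for the first few primes.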


The subgroup $M_3$ is constructed using the projective linear group $PGL(2,\mathbb{F}_p)$ 
discussed in the Mathematical Notes to Section 14.3. The normal subgroup 


$\mathbb{F}_p^* I_2 = \{\lambda I_2 \mid \lambda \in \mathbb{F}_p^*\} \subset GL(2,\mathbb{F}_p)$ 

gives the quotient group 

$PGL(2,\mathbb{F}_p) = GL(2,\mathbb{F}_p)/\mathbb{F}_p^* I_2$, 

where the image of $m = \begin{pmatrix} a & b \\ c & d \end{pmatrix} \in GL(2,\mathbb{F}_p)$ in the quotient will be denoted 

$[m] = \begin{bmatrix} a & b \\ c & d \end{bmatrix} \in PGL(2,\mathbb{F}_p)$. 


To define $M_3$, we will construct a subgroup of $PGL(2,\mathbb{F}_p)$ isomorphic to $S_4$ when 
p > 2. By Exercise 1 of Section 8.1, $S_4$ has the normal subgroup 


$\langle (12)(34), (13)(24) \rangle \simeq (\mathbb{Z}/2\mathbb{Z})^2$. 

Our strategy for finding a subgroup of $PGL(2,\mathbb{F}_p)$ isomorphic to $S_4$ uses a carefully 
chosen subgroup isomorphic to $(\mathbb{Z}/2\mathbb{Z})^2$. Here is the precise result. 


Proposition 14.4.4 Assume that p > 2. Then: 

(a) There exist $g,h \in GL(2,\mathbb{F}_p)$ such that $gh = -hg$ and $\det(g) = \det(h) = 1$. 

(b) Let g,h be as in part (a). Then $g^2 = h^2 = -I_2$, and $[g],[h] \in PGL(2,\mathbb{F}_p)$ generate 
a subgroup H such that 


$H = \langle [g],[h] \rangle \simeq (\mathbb{Z}/2\mathbb{Z})^2$. 


Furthermore, the centralizer $C(H) = C_{PGL(2,\mathbb{F}_p)}(H)$ (consisting of all elements 
of $PGL(2,\mathbb{F}_p)$ that commute with every element of H) satisfies 

$C(H) = H$, 

and the normalizer $N(H) = N_{PGL(2,\mathbb{F}_p)}(H)$ satisfies 

$N(H) \simeq S_4$. 

(c) The subgroups H and N(H) defined in part (b) are unique up to conjugacy by 
elements of $PGL(2,\mathbb{F}_p)$. 


Proof: In Exercise 5 you will prove that there are $s,t \in \mathbb{F}_p$ such that $s^2 + t^2 = -1$. 
Then let 

$g = \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix}$ and $h = \begin{pmatrix} s & t \\ t & -s \end{pmatrix}$. 


One easily computes that g,h have the desired properties. This proves part (a). 
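
The computation for part (a) can also be carried out numerically. The sketch below is an illustration with hypothetical helper names: it searches for $s,t$ with $s^2 + t^2 = -1$ in $\mathbb{F}_p$ by brute force (existence is the content of Exercise 5), builds g and h as above, and checks the stated properties for a few odd primes.

```python
from itertools import product

def mat_mul(x, y, p):
    """Multiply two 2 x 2 matrices with entries reduced mod p."""
    (a, b), (c, d) = x
    (e, f), (g, h) = y
    return (((a*e + b*g) % p, (a*f + b*h) % p),
            ((c*e + d*g) % p, (c*f + d*h) % p))

def det(m, p):
    (a, b), (c, d) = m
    return (a*d - b*c) % p

def negate(m, p):
    return tuple(tuple((-x) % p for x in row) for row in m)

def anticommuting_pair(p):
    """Return g, h in GL(2, F_p) built from a solution of s^2 + t^2 = -1."""
    s, t = next((s, t) for s, t in product(range(p), repeat=2)
                if (s*s + t*t + 1) % p == 0)
    g = ((0, p - 1), (1, 0))          # the matrix (0 -1; 1 0)
    h = ((s, t), (t, (-s) % p))       # the matrix (s t; t -s)
    return g, h

for p in (3, 5, 7, 11, 13):
    g, h = anticommuting_pair(p)
    minus_identity = ((p - 1, 0), (0, p - 1))
    assert mat_mul(g, h, p) == negate(mat_mul(h, g, p), p)   # gh = -hg
    assert det(g, p) == det(h, p) == 1
    assert mat_mul(g, g, p) == mat_mul(h, h, p) == minus_identity
    print(p, g, h)
```

The last assertion anticipates the identity $g^2 = h^2 = -I_2$ from part (b).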

For part (b), we have g,h with $gh = -hg$ and $\det(g) = \det(h) = 1$. We first show 
that $g^2 = -I_2$. Since $\det(g) = 1$, the characteristic polynomial $P(x) = \det(g - xI_2)$ 
can be written $P(x) = x^2 + ax + 1$. Then the Cayley-Hamilton Theorem implies that 


$g^2 + ag + I_2 = 0$. 

(Do Exercise 6 if you didn't study this in your linear algebra course.) Conjugating 
by h and using $hgh^{-1} = -g$ easily implies that 

$g^2 - ag + I_2 = 0$. 


Adding these equations and dividing by 2 (p > 2) implies that 

$g^2 = -I_2$, 


and reversing the roles of g and h gives $h^2 = -I_2$. Note also that neither g nor 
h lies in $\mathbb{F}_p^* I_2$ since $gh = -hg \ne hg$ (p > 2). It follows easily that the subgroup 
$H = \langle [g],[h] \rangle \subset PGL(2,\mathbb{F}_p)$ is isomorphic to $(\mathbb{Z}/2\mathbb{Z})^2$. 

To study C(H), first note that if $m_1, m_2 \in GL(2,\mathbb{F}_p)$, then 


(14.23) $m_1m_2 = \pm m_2m_1 \iff [m_1][m_2] = [m_2][m_1]$. 

One direction is obvious, since $[\pm m_2] = [m_2]$. For the other direction, observe 
that $[m_1][m_2] = [m_2][m_1]$ implies that $m_1m_2 = \lambda m_2m_1$ for some $\lambda \in \mathbb{F}_p^*$. Taking the 
determinant of each side shows that $\lambda^2 = 1$, so that $\lambda = \pm 1$. 

Now let $[m] \in C(H)$. Then $mg = \pm gm$ and $mh = \pm hm$ by (14.23). If both signs 
are +, then $m \in C_{GL(2,\mathbb{F}_p)}(g) \cap C_{GL(2,\mathbb{F}_p)}(h)$. Lemma 14.4.3 implies that 


$m = aI_2 + bg = cI_2 + dh, \quad a,b,c,d \in \mathbb{F}_p$. 


If $b \ne 0$, then g would be a linear combination of $I_2$ and h and hence would commute 
with h. This is impossible, since $gh = -hg$ and p > 2. Thus b = 0, which shows that 
m is a multiple of $I_2$. It follows that $[m] = [I_2] \in H$. 

On the other hand, if $mg = -gm$ and $mh = hm$, then 


$(mh)g = m(hg) = m(-gh) = (-mg)h = (gm)h = g(mh)$, 
$(mh)h = (hm)h = h(mh)$, 


so that $mh \in C_{GL(2,\mathbb{F}_p)}(g) \cap C_{GL(2,\mathbb{F}_p)}(h)$. By the above paragraph, $[mh] = [I_2]$, hence 
$[m] = [h] \in H$ since $h^2 = -I_2$. The remaining possibilities for the signs are handled 
similarly and imply that $[m] = [g]$ or $[gh]$. See Exercise 7 for the details. This shows 
that $C(H) \subset H$. The other inclusion is trivial, since H is Abelian. Thus $C(H) = H$. 

To describe N(H), first observe that N(H) acts on H by conjugation. The identity 
element is fixed, so that conjugation permutes the three nonidentity elements 
$[g], [h], [gh]$. It follows that we have a group homomorphism 


$\varphi : N(H) \to S_3$, 


where an element $m \in N(H)$ maps to the permutation of $[g], [h], [gh]$ given by 
conjugation by m. 

The kernel of $\varphi$ consists of those $m \in N(H)$ that conjugate every element of H to 
itself. In other words, $\mathrm{Ker}(\varphi) = C(H)$, which is H by the above. Since $|H| = 4$, it 
follows immediately that $|N(H)| \le 24$, with equality if and only if $\varphi$ is onto. 

To prove that $\varphi$ is onto, note that $(I_2 + g)(I_2 - g) = 2I_2$ since $g^2 = -I_2$. Thus 


$(I_2 + g)^{-1} = \tfrac{1}{2}(I_2 - g)$ 

since p > 2. Then conjugating h by $I_2 + g$ gives 


$(I_2 + g)h(I_2 + g)^{-1} = \tfrac{1}{2}(I_2 + g)h(I_2 - g) = \tfrac{1}{2}(h - hg + gh - ghg)$ 
$= \tfrac{1}{2}(h + gh + gh + hg^2) = gh$, 


where we use $gh = -hg$ and $g^2 = -I_2$. It follows easily that $[I_2 + g]$ conjugates $[g]$ 
to itself and interchanges $[h]$ and $[gh]$. Thus $[I_2 + g]$ is an element of N(H) that maps 
to a 2-cycle in $S_3$. Similarly, $[I_2 + h]$ conjugates $[h]$ to itself and interchanges $[g]$ and 
$[gh]$, so that $[I_2 + h] \in N(H)$ maps to a different 2-cycle. Since $S_3$ is generated by any 
two distinct 2-cycles, we see that $\varphi$ is onto and $|N(H)| = 24$. 

The final step is to show that $N(H) \simeq S_4$. In Exercise 8 you will show that N(H) 
has four 3-Sylow subgroups. Then the action of N(H) on its 3-Sylow subgroups 
gives a group homomorphism $N(H) \to S_4$. In Exercise 8 you will prove that this map 
is an isomorphism. The proof of part (b) is now complete. 
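
For small primes, the orders $|C(H)| = 4$ and $|N(H)| = 24$ from part (b) can be confirmed by enumerating $PGL(2,\mathbb{F}_p)$. The following sketch is an illustration only: the helper names are ours, a class $[m]$ is represented by scaling m so that its first nonzero entry is 1, and the inverse of $[m]$ is represented by the adjugate of m.

```python
from itertools import product

def mat_mul(x, y, p):
    """Multiply two 2 x 2 matrices with entries reduced mod p."""
    (a, b), (c, d) = x
    (e, f), (g, h) = y
    return (((a*e + b*g) % p, (a*f + b*h) % p),
            ((c*e + d*g) % p, (c*f + d*h) % p))

def normalize(m, p):
    """Canonical representative of [m] in PGL(2, F_p): first nonzero entry is 1."""
    flat = [x % p for row in m for x in row]
    lead = next(x for x in flat if x != 0)
    inv = pow(lead, p - 2, p)           # inverse mod p by Fermat's little theorem
    a, b, c, d = [(x * inv) % p for x in flat]
    return ((a, b), (c, d))

def adjugate(m, p):
    """adj(m) = det(m) m^{-1}, so [adj(m)] is the inverse of [m] in PGL."""
    (a, b), (c, d) = m
    return ((d % p, (-b) % p), ((-c) % p, a % p))

def klein_four(p):
    """The subgroup H = <[g],[h]> built from a solution of s^2 + t^2 = -1."""
    s, t = next((s, t) for s, t in product(range(p), repeat=2)
                if (s*s + t*t + 1) % p == 0)
    g = ((0, p - 1), (1, 0))
    h = ((s, t), (t, (-s) % p))
    identity = ((1, 0), (0, 1))
    return {normalize(m, p) for m in (identity, g, h, mat_mul(g, h, p))}

def centralizer_and_normalizer_orders(p):
    """Return (|C(H)|, |N(H)|) computed inside PGL(2, F_p) by brute force."""
    pgl = {normalize(((a, b), (c, d)), p)
           for a, b, c, d in product(range(p), repeat=4)
           if (a*d - b*c) % p != 0}
    H = klein_four(p)

    def conj(x, k):
        return normalize(mat_mul(mat_mul(x, k, p), adjugate(x, p), p), p)

    cent = [x for x in pgl if all(conj(x, k) == k for k in H)]
    norm = [x for x in pgl if {conj(x, k) for k in H} == H]
    return len(cent), len(norm)

for p in (3, 5, 7):
    print(p, centralizer_and_normalizer_orders(p))
```

This checks only the orders; identifying N(H) with $S_4$ still requires the Sylow argument of Exercise 8.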

For part (c), first suppose that $g \notin \mathbb{F}_p^* I_2$ satisfies $g^2 = -I_2$. By Exercise 4, there is 
$v \in \mathbb{F}_p^2$ such that v, gv form a basis of $\mathbb{F}_p^2$. Since g takes v to gv and gv to $g^2v = -v$, it 
follows that 

$Q^{-1}gQ = \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix}$, 


where Q is the matrix whose columns are v and gv. This easily implies that all 
elements $g \notin \mathbb{F}_p^* I_2$ satisfying $g^2 = -I_2$ are conjugate. 

Now suppose that we have g,h and g',h' as in the statement of the proposition. 
Then the above paragraph shows that g and g' are conjugate, say $g = Qg'Q^{-1}$. 
Replacing g',h' with their conjugates by Q, we may assume that g = g'. We need to 
conjugate h to h' in a way that preserves g. 

Since $gh = -hg$ and $gh' = -h'g$, it is easy to see that $h^{-1}h'$ commutes with g 
and hence lies in $C(g) = C_{GL(2,\mathbb{F}_p)}(g)$. Note also that $\det(h^{-1}h') = 1$, since $\det(h) = 
\det(h') = 1$. In Exercise 9, you will show that this implies that 


(14.24) $h^{-1}h' = \det(m)\,m^{-2}, \quad m \in C(g)$. 


Since $g \notin \mathbb{F}_p^* I_2$, Lemma 14.4.3 implies that m is a linear combination of $I_2$ and g. 
Using this together with $gh = -hg$ and $g^2 = -I_2$, one easily computes that 


$mhm = ch, \quad c \in \mathbb{F}_p$. 

Taking determinants gives $c = \pm\det(m)$. Combining this with (14.24), we obtain 


$mhm^{-1} = (mhm)m^{-2} = (\pm\det(m)\,h)m^{-2}$ 
$= \pm h(\det(m)\,m^{-2}) = \pm h(h^{-1}h') = \pm h'$. 


Since $m \in C(g)$, it follows immediately that $[m]$ conjugates $H = \langle [g],[h] \rangle$ to 
$H' = \langle [g],[h'] \rangle$. This easily implies the corresponding statement for N(H) and N(H'). The 
proposition is now proved. ∎ 


We can give explicit generators for the subgroup N(H) described in Proposition 14.4.4. 
When $p \equiv 1 \bmod 4$, we can find an element $i \in \mathbb{F}_p^*$ of order 4. In 
Exercise 10 you will show that N(H) is generated by the images of the matrices 


(14.25) $\begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \quad \begin{pmatrix} i & 0 \\ 0 & 1 \end{pmatrix}, \quad \begin{pmatrix} 1 & -1 \\ 1 & 1 \end{pmatrix}$. 

Replacing i with $\sqrt{-1} \in \mathbb{C}$ in (14.25) gives the matrices from Example 7.5.10. There, 
we showed that the images of these matrices in $PGL(2,\mathbb{C})$ generate the rotational 
symmetry group of the octahedron. Furthermore, this symmetry group is isomorphic 
to $S_4$ by Exercise 10 of Section 7.5. So it is nice to see that the same matrices work in 
$\mathbb{F}_p$ when $p \equiv 1 \bmod 4$. Explicit generators for N(H) when $p \equiv 3 \bmod 4$ are described 
in Exercise 10. 
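
As a quick check on generators of this kind, one can compute the subgroup of $PGL(2,\mathbb{F}_p)$ generated by the images of the matrices $\begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}$, $\begin{pmatrix} i & 0 \\ 0 & 1 \end{pmatrix}$, $\begin{pmatrix} 1 & -1 \\ 1 & 1 \end{pmatrix}$ and confirm that it has order 24. The sketch below is illustrative and uses hypothetical helper names; since the group is finite, closing the identity under right multiplication by the generators already produces the whole generated subgroup.

```python
def mat_mul(x, y, p):
    """Multiply two 2 x 2 matrices with entries reduced mod p."""
    (a, b), (c, d) = x
    (e, f), (g, h) = y
    return (((a*e + b*g) % p, (a*f + b*h) % p),
            ((c*e + d*g) % p, (c*f + d*h) % p))

def normalize(m, p):
    """Canonical representative of [m] in PGL(2, F_p): first nonzero entry is 1."""
    flat = [x % p for row in m for x in row]
    lead = next(x for x in flat if x != 0)
    inv = pow(lead, p - 2, p)
    a, b, c, d = [(x * inv) % p for x in flat]
    return ((a, b), (c, d))

def element_of_order_four(p):
    """Find i in F_p with i^2 = -1; this requires p = 1 mod 4."""
    return next(x for x in range(2, p) if (x * x) % p == p - 1)

def generated_order(p):
    """Order of the subgroup of PGL(2, F_p) generated by the three matrices."""
    i = element_of_order_four(p)
    gens = [normalize(m, p) for m in (((0, 1), (1, 0)),
                                      ((i, 0), (0, 1)),
                                      ((1, p - 1), (1, 1)))]
    identity = ((1, 0), (0, 1))
    seen = {identity}
    frontier = [identity]
    while frontier:
        x = frontier.pop()
        for s in gens:
            y = normalize(mat_mul(x, s, p), p)
            if y not in seen:
                seen.add(y)
                frontier.append(y)
    return len(seen)

for p in (5, 13, 17):
    print(p, element_of_order_four(p), generated_order(p))
```

An order of 24 is consistent with the generated subgroup being all of N(H); the proof that it is N(H) is the subject of Exercise 10.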

We can finally construct $M_3$. Assume that p > 2 and consider the homomorphism 

$\pi : AGL(2,\mathbb{F}_p) \longrightarrow GL(2,\mathbb{F}_p) \longrightarrow PGL(2,\mathbb{F}_p)$, 


where the first map takes $\gamma_{A,v}$ to A and the second is the quotient map that takes A to 
$[A]$. By Proposition 14.4.4, we have $S_4 \simeq N(H) \subset PGL(2,\mathbb{F}_p)$. Then $M_3$ is defined 
to be the inverse image of this subgroup under $\pi$. Thus 


(14.26) $M_3 = \pi^{-1}(N(H)) \subset AGL(2,\mathbb{F}_p) \subset S_{p^2}$. 

The subgroup $M_3$ has the following properties. 


Proposition 14.4.5 Let p > 2. The subgroup $M_3 \subset S_{p^2}$ described in (14.26) is 
solvable and primitive. Furthermore, $|M_3| = 24p^2(p-1)$. 


Proof: It is straightforward to show that $M_3$ is solvable because $S_4$ is, and the order 
of $M_3$ is also easy to compute. We leave this as Exercise 11. 

It remains to prove that $M_3$ is primitive. By Proposition 14.3.20, it suffices to 
show that $(M_3)_0$ is irreducible. First observe that $(M_3)_0$ is the inverse image of 
$N(H) \subset PGL(2,\mathbb{F}_p)$ under the quotient map $GL(2,\mathbb{F}_p) \to PGL(2,\mathbb{F}_p)$. Let $V \subset \mathbb{F}_p^2$ 
be a one-dimensional subspace mapped to itself by $(M_3)_0$. Thus V is a simultaneous 
eigenspace for all elements of $(M_3)_0$. We derive a contradiction as follows. 

Proposition 14.4.4 and the definition of $M_3$ imply that $(M_3)_0$ contains elements 
g,h such that $gh = -hg$. Now let $v \in V$ be nonzero. Then $gh(v) = -hg(v)$. However, 
since v is an eigenvector for g and h, one easily sees that $gh(v) = hg(v)$. This gives a 
contradiction since p > 2, and the irreducibility of $(M_3)_0$ follows. ∎ 


It is possible to describe $M_3$ more explicitly. First, one can show that $M_3$ is 
isomorphic to the semidirect product $\mathbb{F}_p^2 \rtimes (M_3)_0$. Furthermore, a careful description 
of the structure of $(M_3)_0$ can be found in [16, Ch. 5], where $(M_3)_0$ is denoted by $M_4$ 
when $p \equiv 1 \bmod 4$ and by $M_3$ when $p \equiv 3 \bmod 4$. 

One important observation is that except for certain small primes p, the subgroups 
$M_1$, $M_2$, and $M_3$ of $AGL(2,\mathbb{F}_p)$ satisfy 


(14.27) $M_i \not\subset M_j$ when $i \ne j$. 


Hence we really need three subgroups. We showed above that (14.27) holds for $M_1$ 
and $M_2$ when p > 3. Then comparing $|M_3| = 24p^2(p-1)$ with $|M_1| = 2p^2(p^2-1)$ and 
$|M_2| = 2p^2(p-1)^2$ shows that $M_1 \not\subset M_3$ and $M_2 \not\subset M_3$ when p > 13. Furthermore, 
$(M_1)_0$ and $(M_2)_0$ have Abelian subgroups of index 2, which easily implies that 
$M_3 \not\subset M_1$ and $M_3 \not\subset M_2$. See Exercise 12 for more details, including a precise list of 
the exceptions to (14.27). 
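
The order comparison used above is simple arithmetic, and it is instructive to tabulate it. The sketch below (function names are ours) evaluates the three order formulas from Propositions 14.4.1, 14.4.2, and 14.4.5 and reports for which primes the orders alone rule out $M_1 \subset M_3$ or $M_2 \subset M_3$. It says nothing about the primes where the orders do not decide the question; those are handled in Exercise 12.

```python
from fractions import Fraction

def orders(p):
    """Return (|M_1|, |M_2|, |M_3|) from the order formulas in the text."""
    m1 = 2 * p**2 * (p**2 - 1)
    m2 = 2 * p**2 * (p - 1)**2
    m3 = 24 * p**2 * (p - 1)
    return m1, m2, m3

def order_excludes_containment(p):
    """(|M_1| > |M_3|, |M_2| > |M_3|): a larger group cannot sit inside a smaller one."""
    m1, m2, m3 = orders(p)
    return m1 > m3, m2 > m3

for p in (3, 5, 7, 11, 13, 17, 19):
    m1, m2, m3 = orders(p)
    # The ratio |M_1| / |M_2| simplifies to (p + 1) / (p - 1).
    assert Fraction(m1, m2) == Fraction(p + 1, p - 1)
    print(p, m1, m2, m3, order_excludes_containment(p))
```

Since $|M_1| > |M_3|$ exactly when $2(p+1) > 24$ and $|M_2| > |M_3|$ exactly when $2(p-1) > 24$, both comparisons succeed precisely for p > 13, in agreement with the text.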


C. The Solvable Case. We can now state our main result concerning solvable 
primitive subgroups of $S_{p^2}$. Since every subgroup of $S_{2^2} = S_4$ is solvable, we will 
assume that p > 2. 


Theorem 14.4.6 Let $G \subset S_{p^2}$ be primitive, where p > 2 is prime. Then the following 
are equivalent: 

(a) G is solvable. 

(b) G is conjugate to a subgroup of one of the groups $M_1$, $M_2$, $M_3$ defined in (14.19), 
(14.21), (14.26), respectively. 


Proof: The proof of (b) $\Rightarrow$ (a) is easy, since we know that $M_1$, $M_2$, and $M_3$ are 
solvable by Propositions 14.4.1, 14.4.2, and 14.4.5. 

To prove (a) $\Rightarrow$ (b), first note that G is conjugate to a subgroup of $AGL(2,\mathbb{F}_p)$ 
containing $\mathbb{F}_p^2$, by Theorem 14.3.17. Furthermore, Proposition 14.3.20 implies that the 
isotropy subgroup $G_0$ of $0 \in \mathbb{F}_p^2$ is irreducible and solvable. It is also straightforward 
to show that G is uniquely determined by $G_0 \subset GL(2,\mathbb{F}_p)$. Thus it suffices to prove 
that in $GL(2,\mathbb{F}_p)$, $G_0$ is conjugate to a subgroup of $(M_i)_0$, $i \in \{1,2,3\}$. 

We also note that we can assume that $\mathbb{F}_p^* I_2 \subset G_0$. To see why, note that matrices 
in $\mathbb{F}_p^* I_2$ commute with all elements of $GL(2,\mathbb{F}_p)$. This makes it easy to see that the 
subgroup of $GL(2,\mathbb{F}_p)$ generated by $G_0$ and $\mathbb{F}_p^* I_2$ is solvable (you will prove this 
carefully in Exercise 13). If this larger group lies in some $(M_i)_0$ up to conjugacy, then 
so does $G_0$. Hence we may assume that $\mathbb{F}_p^* I_2 \subset G_0$. In particular, $\mathbb{F}_p^* I_2$ is an Abelian 
normal subgroup of $G_0$. 

Let $A \subset G_0$ be an Abelian normal subgroup containing $\mathbb{F}_p^* I_2$ of maximal order. 
The proof now breaks up into two cases, depending on A. 


Case 1: First suppose that $A \ne \mathbb{F}_p^* I_2$, and pick $g \in A \setminus \mathbb{F}_p^* I_2$. Then consider the 
centralizer 

$C(g) = C_{GL(2,\mathbb{F}_p)}(g)$ 

and its normalizer 

$N(C(g)) = N_{GL(2,\mathbb{F}_p)}(C(g))$. 


Our strategy will be to first prove that 

(14.28) $G_0 \subset N(C(g))$ 


and then show that N(C(g)) is conjugate to either $(M_1)_0$ or $(M_2)_0$. 

To prove (14.28), take $m \in G_0$. Then $mgm^{-1} \in A$, since $g \in A$ and A is normal 
in $G_0$. We also know that $A \subset C(g)$, since A is Abelian. Hence $mgm^{-1} \in C(g)$. By 
Lemma 14.4.3, this implies that 


$mgm^{-1} = aI_2 + bg, \quad a,b \in \mathbb{F}_p$. 


Now take an arbitrary element $h \in C(g)$. Using Lemma 14.4.3 again, we can write 
$h = cI_2 + dg$, where $c,d \in \mathbb{F}_p$. Then 


$mhm^{-1} = m(cI_2 + dg)m^{-1} = cI_2 + d\,mgm^{-1}$ 
$= cI_2 + d(aI_2 + bg) = (c + da)I_2 + dbg$. 


This lies in C(g) by Lemma 14.4.3. Thus m normalizes C(g), so that $m \in N(C(g))$. 
This completes the proof of (14.28). 

The next step is to study N(C(g)). Here, our main tool will be the characteristic 
polynomial $P(x) = \det(g - xI_2)$ of g. This is a quadratic polynomial with coefficients 
in $\mathbb{F}_p$. There are three possible behaviors for P(x): 

• P(x) is reducible and separable. 

• P(x) is reducible and nonseparable. 

• P(x) is irreducible. 

We will consider each possibility separately. 


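Reducible and Separable. We will show that N(C(g)) is conjugate to $(M_2)_0$. By 
hypothesis, the eigenvalues of g are $\alpha \ne \beta$ in $\mathbb{F}_p$, which means that g is diagonalizable 
(be sure you can explain why). Hence there is $Q \in GL(2,\mathbb{F}_p)$ such that 


$QgQ^{-1} = \begin{pmatrix} \alpha & 0 \\ 0 & \beta \end{pmatrix}$. 


If we replace $G_0$ with its conjugate $QG_0Q^{-1}$, then we may assume that 


$g = \begin{pmatrix} \alpha & 0 \\ 0 & \beta \end{pmatrix}$. 

We will show that $N(C(g)) = (M_2)_0$ in this situation. 
Using Lemma 14.4.3, it is easy to see that 


(14.29) $C(g) = \left\{ \begin{pmatrix} \mu & 0 \\ 0 & \nu \end{pmatrix} \;\middle|\; \mu, \nu \in \mathbb{F}_p^* \right\}$ 


(see Exercise 14). Now let $m = \begin{pmatrix} a & b \\ c & d \end{pmatrix} \in N(C(g))$. Then $mgm^{-1} \in C(g)$, which by 
the above description of C(g) implies that 


$m \begin{pmatrix} \alpha & 0 \\ 0 & \beta \end{pmatrix} m^{-1} = \begin{pmatrix} \mu & 0 \\ 0 & \nu \end{pmatrix}$, 


where $\mu \ne \nu$ because $\alpha \ne \beta$. If we multiply on the right by m and compare entries, 
then it is straightforward to show that b = c = 0 or a = d = 0. Hence 


$m = \begin{pmatrix} a & 0 \\ 0 & d \end{pmatrix}$ or $m = \begin{pmatrix} 0 & b \\ c & 0 \end{pmatrix} = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} \begin{pmatrix} c & 0 \\ 0 & b \end{pmatrix}$. 

Since $(M_2)_0$ is generated by the matrices (14.22), it follows that $m \in (M_2)_0$. Thus 


$N(C(g)) \subset (M_2)_0$. 


The opposite inclusion is straightforward to prove (see Exercise 14). We conclude 
that $N(C(g)) = (M_2)_0$. 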
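
For small odd primes, the description of the normalizer of the diagonal subgroup can be checked directly. The following sketch is an illustration with hypothetical helper names: it computes the normalizer of the diagonal subgroup of $GL(2,\mathbb{F}_p)$ by brute force and compares it with the set of matrices that are either diagonal or antidiagonal, which is the group generated by the matrices (14.22).

```python
from itertools import product

def mat_mul(x, y, p):
    """Multiply two 2 x 2 matrices with entries reduced mod p."""
    (a, b), (c, d) = x
    (e, f), (g, h) = y
    return (((a*e + b*g) % p, (a*f + b*h) % p),
            ((c*e + d*g) % p, (c*f + d*h) % p))

def inverse(m, p):
    """Inverse of an invertible 2 x 2 matrix over F_p."""
    (a, b), (c, d) = m
    det_inv = pow((a*d - b*c) % p, p - 2, p)
    return (((d * det_inv) % p, (-b * det_inv) % p),
            ((-c * det_inv) % p, (a * det_inv) % p))

def torus_normalizer(p):
    """Return (normalizer of the diagonal subgroup, set of monomial matrices)."""
    group = [((a, b), (c, d)) for a, b, c, d in product(range(p), repeat=4)
             if (a*d - b*c) % p != 0]
    torus = {((u, 0), (0, v)) for u, v in product(range(1, p), repeat=2)}
    normalizer = {m for m in group
                  if all(mat_mul(mat_mul(m, t, p), inverse(m, p), p) in torus
                         for t in torus)}
    monomial = {m for m in group
                if (m[0][1] == 0 and m[1][0] == 0)
                or (m[0][0] == 0 and m[1][1] == 0)}
    return normalizer, monomial

for p in (3, 5):
    normalizer, monomial = torus_normalizer(p)
    print(p, len(normalizer), normalizer == monomial)
```

The order found is $2(p-1)^2$, which matches $|M_2| = 2p^2(p-1)^2$ after accounting for the $p^2$ translations.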


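Reducible and Nonseparable. We will show that this case can't occur, since $G_0$ 
is irreducible. By hypothesis, the only eigenvalue of g is $\alpha \in \mathbb{F}_p$. Hence there is 
$Q \in GL(2,\mathbb{F}_p)$ such that 

$QgQ^{-1} = \begin{pmatrix} \alpha & \beta \\ 0 & \alpha \end{pmatrix}$, 

where $\beta \ne 0$ because $g \notin \mathbb{F}_p^* I_2$. Now replace $G_0$ with $QG_0Q^{-1}$ and note that $G_0$ 
remains irreducible. Hence we may assume that 


$g = \begin{pmatrix} \alpha & \beta \\ 0 & \alpha \end{pmatrix}$. 


In Exercise 15, you will use Lemma 14.4.3 to show that 


(14.30) $C(g) = \left\{ \begin{pmatrix} \mu & \nu \\ 0 & \mu \end{pmatrix} \;\middle|\; \mu, \nu \in \mathbb{F}_p,\ \mu \ne 0 \right\}$, 


and you will prove that the normalizer of C(g) is 


(14.31) $N(C(g)) = \left\{ \begin{pmatrix} \mu & \nu \\ 0 & \lambda \end{pmatrix} \;\middle|\; \mu, \nu, \lambda \in \mathbb{F}_p,\ \mu\lambda \ne 0 \right\}$. 


Also recall from (14.28) that $G_0 \subset N(C(g))$. 
We obtain a contradiction as follows. Let $V \subset \mathbb{F}_p^2$ be the subspace spanned by 
the vector $\begin{pmatrix} 1 \\ 0 \end{pmatrix} \in \mathbb{F}_p^2$. Since every element of (14.31) takes V to itself, we see that 
$G_0 \subset N(C(g))$ cannot be irreducible. This gives the desired contradiction. 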


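Irreducible. We will show that N(C(g)) is conjugate to the subgroup $(M_1)_0$. Since 
$M_1 = A\Gamma L(1,\mathbb{F}_{p^2})$, it is easy to see that $(M_1)_0$ is the group $\Gamma L(1,\mathbb{F}_{p^2})$ consisting of 
semilinear maps $\gamma_{a,\sigma} : \mathbb{F}_{p^2} \to \mathbb{F}_{p^2}$, $a \in \mathbb{F}_{p^2}^*$, $\sigma \in \mathrm{Gal}(\mathbb{F}_{p^2}/\mathbb{F}_p)$, defined by 


(14.32) $\gamma_{a,\sigma}(u) = a\sigma(u), \quad u \in \mathbb{F}_{p^2}$. 


While $\gamma_{a,\sigma}$ need not be linear over $\mathbb{F}_{p^2}$, it is always linear over $\mathbb{F}_p$ (do you see why?). 
To represent $\gamma_{a,\sigma}$ as an element of $GL(2,\mathbb{F}_p)$, we will use an isomorphism 


(14.33) $T : \mathbb{F}_{p^2} \simeq \mathbb{F}_p^2$ 


of vector spaces over $\mathbb{F}_p$. Let $\mathrm{Aut}_{\mathbb{F}_p}(\mathbb{F}_{p^2})$ be the group of vector space isomorphisms 
$\mathbb{F}_{p^2} \to \mathbb{F}_{p^2}$ that are linear over $\mathbb{F}_p$. Then we have a group isomorphism 


(14.34) $\mathrm{Aut}_{\mathbb{F}_p}(\mathbb{F}_{p^2}) \simeq GL(2,\mathbb{F}_p)$ 

where an $\mathbb{F}_p$-linear isomorphism $\varphi : \mathbb{F}_{p^2} \to \mathbb{F}_{p^2}$ maps to the matrix representing 

$T \circ \varphi \circ T^{-1} : \mathbb{F}_p^2 \to \mathbb{F}_p^2$ 

(you will verify this in Exercise 16). Under (14.34), the subgroup 

$\Gamma L(1,\mathbb{F}_{p^2}) \subset \mathrm{Aut}_{\mathbb{F}_p}(\mathbb{F}_{p^2})$ 

maps to 

$(M_1)_0 \subset GL(2,\mathbb{F}_p)$. 


Different isomorphisms T in (14.33) give different isomorphisms (14.34) that are 
related by conjugation in $GL(2,\mathbb{F}_p)$ (see Exercise 16). 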
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By assumption, g € GL(2,F,) has irreducible characteristic polynomial P(x). To 
analyze N(C(g)), we will make a special choice of T in (14.33). Consider the 
following bases of F,2 and F?: 


e Since P(x) has degree 2, it splits completely in F,2. Let a € Fz be a root, and 
note that a ¢ F,, since P(x) is irreducible over F,. Then 1, a form a basis of F,2 
as a vector space over F,,. 


e Write g = (2°), and observe that c # 0, since otherwise P(x) would have roots 
a,d € Fy. Then (;), (¢) form a basis of F? as a vector space over Fy. 


Using these bases, define T : F,2 ~ F> by 


T(1)=() and T(a)= (4). 


We claim that for this choice of T, the element of Autg,(F,2) corresponding to 
g € GL(2,F,) via (14.34) is multiplication by a. 
More precisely, define 7, : F,2 > F,2 by 7,(8) = of for 8 € Fz. We must show 
that 
Toy,0T '=g, ie, Toy, =g0T, 


where we now think of g as the linear map given by matrix multiplication. To prove 
this, first note that 


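(14.35) $T \circ \gamma_\alpha(1) = T(\alpha) = \binom{a}{c} = \begin{pmatrix} a & b \\ c & d \end{pmatrix}\binom{1}{0} = g\binom{1}{0} = g \circ T(1)$.
If the characteristic polynomial of $g$ is $P(x) = x^2 + ax + b$, then
$g^2 + ag + bI_2 = 0$

by the Cayley-Hamilton Theorem. Using this and $T(\alpha) = g \circ T(1)$ from (14.35), we
obtain
$g \circ T(\alpha) = g^2 \circ T(1) = (-ag - bI_2) \circ T(1) = -a\,g \circ T(1) - bT(1)$
$= -aT(\alpha) - bT(1)$.

Since $\alpha^2 + a\alpha + b = 0$, we also have
$T \circ \gamma_\alpha(\alpha) = T(\alpha^2) = T(-a\alpha - b) = -aT(\alpha) - bT(1)$.

Thus $T \circ \gamma_\alpha(\alpha) = g \circ T(\alpha)$. This and (14.35) imply that $T \circ \gamma_\alpha = g \circ T$. We conclude
that $g$ corresponds to $\gamma_\alpha$ under (14.34), as claimed.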
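
The identity $T \circ \gamma_\alpha = g \circ T$ can also be checked numerically. The following Python sketch does this for one sample matrix over $\mathbb{F}_7$. The prime, the matrix entries, and all function names are illustrative assumptions, not part of the text. Elements $u + v\alpha$ of $\mathbb{F}_{p^2}$ are stored as pairs `(u, v)`, and the characteristic polynomial is written as $x^2 - \mathrm{tr}(g)x + \det(g)$, so that $a = -\mathrm{tr}(g)$ and $b = \det(g)$ in the notation of the proof.

```python
def is_irreducible(tr, det, p):
    """x^2 - tr*x + det has no root in F_p (enough for degree 2)."""
    return all((x * x - tr * x + det) % p != 0 for x in range(p))

def T(beta, g, p):
    """T(u + v*alpha) = u*(1,0) + v*(A,C), where (A,C) is the first column of g."""
    (A, _), (C, _) = g
    u, v = beta
    return ((u + v * A) % p, (v * C) % p)

def gamma_alpha(beta, g, p):
    """Multiplication by alpha, using alpha^2 = tr*alpha - det."""
    (A, B), (C, D) = g
    tr, det = (A + D) % p, (A * D - B * C) % p
    u, v = beta
    return ((-v * det) % p, (u + v * tr) % p)

def apply_g(vec, g, p):
    """Matrix multiplication g * vec over F_p."""
    (A, B), (C, D) = g
    x, y = vec
    return ((A * x + B * y) % p, (C * x + D * y) % p)

def check_intertwining(g, p):
    """Return True if T(alpha*beta) == g*T(beta) for every beta in F_{p^2}."""
    (A, B), (C, D) = g
    tr, det = (A + D) % p, (A * D - B * C) % p
    if not is_irreducible(tr, det, p):
        raise ValueError("characteristic polynomial must be irreducible over F_p")
    return all(
        T(gamma_alpha((u, v), g, p), g, p) == apply_g(T((u, v), g, p), g, p)
        for u in range(p) for v in range(p)
    )

# Hypothetical sample: g = [[1, 3], [1, 2]] over F_7.
print(check_intertwining(((1, 3), (1, 2)), 7))
```

The check runs over all $p^2$ elements rather than just the basis $1, \alpha$, which is redundant by linearity but makes the sketch easy to trust.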

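It follows that $N(C(g)) \subset \mathrm{GL}(2,\mathbb{F}_p)$ corresponds to $N(C(\gamma_\alpha)) \subset \mathrm{Aut}_{\mathbb{F}_p}(\mathbb{F}_{p^2})$ under
(14.34), where in the latter inclusion, the centralizer and normalizer are now computed
relative to $\mathrm{Aut}_{\mathbb{F}_p}(\mathbb{F}_{p^2})$. Thus, if we can prove that

(14.36) $N(C(\gamma_\alpha)) = \Gamma\mathrm{L}(1,\mathbb{F}_{p^2})$

when $\alpha \in \mathbb{F}_{p^2} \setminus \mathbb{F}_p$, then it will follow that $N(C(g)) = (M_1)_0$. Be sure you understand
this.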




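We now prove (14.36). In Exercise 17 you will show that if $\alpha \in \mathbb{F}_{p^2} \setminus \mathbb{F}_p$, then
(14.37) $C(\gamma_\alpha) = \{aI_2 + b\gamma_\alpha \in \mathrm{Aut}_{\mathbb{F}_p}(\mathbb{F}_{p^2}) \mid a, b \in \mathbb{F}_p\} = \{\gamma_\beta \mid \beta \in \mathbb{F}_{p^2}^*\}$.

Fix $m \in N(C(\gamma_\alpha))$. Then $m \circ \gamma_\alpha \circ m^{-1} \in C(\gamma_\alpha)$, so that $m \circ \gamma_\alpha \circ m^{-1} = \gamma_\beta$ for some
$\beta \in \mathbb{F}_{p^2}^*$. This implies that

$m(\alpha) = m \circ \gamma_\alpha(1) = \gamma_\beta \circ m(1) = \beta m(1)$,
$m(\alpha^2) = m \circ \gamma_\alpha(\alpha) = \gamma_\beta \circ m(\alpha) = \beta m(\alpha) = \beta^2 m(1)$,
where the last equality of the second line uses the first line. Thus
$0 = m(0) = m(\alpha^2 + a\alpha + b) = m(\alpha^2) + a\,m(\alpha) + b\,m(1)$
$= \beta^2 m(1) + a\beta m(1) + b\,m(1) = (\beta^2 + a\beta + b)\,m(1)$.
Since $m(1) \neq 0$ and $\mathbb{F}_{p^2}$ is a field, we must have
(14.38) $\beta^2 + a\beta + b = 0$.
To relate this to $\Gamma\mathrm{L}(1,\mathbb{F}_{p^2})$, write the Galois group of $\mathbb{F}_{p^2}$ over $\mathbb{F}_p$ as
$\mathrm{Gal}(\mathbb{F}_{p^2}/\mathbb{F}_p) = \{e, \sigma\} \simeq \mathbb{Z}/2\mathbb{Z}$,

where $e$ is the identity and $\sigma$ has order 2. Then the roots of $P(x) = x^2 + ax + b$ are $\alpha$
and $\sigma(\alpha)$. Hence (14.38) implies that

$\beta = \alpha$ or $\beta = \sigma(\alpha)$.
In Exercise 17 you will show that if we set $\delta = m(1)$, then

(14.39)
$\beta = \alpha \implies m = \gamma_\delta = \gamma_{\delta,e} \in \Gamma\mathrm{L}(1,\mathbb{F}_{p^2})$,
$\beta = \sigma(\alpha) \implies m = \gamma_\delta \circ \sigma = \gamma_{\delta,\sigma} \in \Gamma\mathrm{L}(1,\mathbb{F}_{p^2})$

in the notation of (14.32). This proves that $N(C(\gamma_\alpha)) \subset \Gamma\mathrm{L}(1,\mathbb{F}_{p^2})$. The opposite
inclusion is straightforward (see Exercise 17), and (14.36) follows.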


Case 2: Next assume that $\mathbb{F}_p^* I_2$ is the maximal Abelian normal subgroup of $G_0$
containing $\mathbb{F}_p^* I_2$. We will show that $G_0$ is conjugate to a subgroup of $(M_3)_0$. First
note that $(M_3)_0 \subset \mathrm{GL}(2,\mathbb{F}_p)$ is the inverse image of the subgroup $N(H) \subset \mathrm{PGL}(2,\mathbb{F}_p)$
from Proposition 14.4.4. Let $G_0' \subset \mathrm{PGL}(2,\mathbb{F}_p)$ be the image of $G_0 \subset \mathrm{GL}(2,\mathbb{F}_p)$. Since
$\mathbb{F}_p^* I_2 \subset G_0$, it suffices to prove that $G_0' \subset N(H)$ after a suitable conjugation.

Fix a minimal normal subgroup $B' \subset G_0'$ as defined in Section 14.3, and let $B$ be
the inverse image of $B'$ in $\mathrm{GL}(2,\mathbb{F}_p)$. Since $\mathbb{F}_p^* I_2 \subset G_0$, we have

$\mathbb{F}_p^* I_2 \subset B \subset G_0$.

Note also that $B$ is normal in $G_0$. We now prove some basic facts about $B'$ and $B$.

For $B'$, first note that $G_0'$ is nontrivial, since $G_0$ is irreducible (be sure you can fill
in the details). Then $B'$ is a minimal normal subgroup of the nontrivial group $G_0'$.
This has two useful consequences:

• $B'$ is generated by the conjugates (with respect to $G_0'$) of any of its nonidentity
elements (Exercise 18).

• $B'$ is Abelian, since $G_0'$ is solvable (Corollary 14.3.11).

(The solvability of $G$ is used twice in the proof of Theorem 14.4.6: at the beginning
of the proof to reduce to $G_0 \subset \mathrm{GL}(2,\mathbb{F}_p)$, and here to imply that $B'$ is Abelian.)

For $B$, recall that its center $Z(B)$ consists of all elements of $B$ commuting with
every element of $B$. We claim that $Z(B)$ is as small as possible, i.e.,

(14.40) $Z(B) = \mathbb{F}_p^* I_2$.

To see why, observe that $Z(B)$ is normal in $G_0$ because $B$ is (you will prove this in
Exercise 18). Note also that $Z(B)$ is Abelian and contains $\mathbb{F}_p^* I_2$. But the hypothesis
of Case 2 is that $A = \mathbb{F}_p^* I_2$, which means that $\mathbb{F}_p^* I_2$ is the maximal Abelian normal
subgroup of $G_0$ containing $\mathbb{F}_p^* I_2$. The equality (14.40) follows immediately.

The next step is to find some interesting elements of $B$. More precisely, we claim
that there are $g, h \in B$ such that

(14.41) $gh = -hg$, $\det(g) = \det(h) = 1$.

To prove this, take $[m_1] \in B'$, $[m_1] \neq [I_2]$. The conjugates of $[m_1]$ generate $B'$, so that $B$
is generated by $\mathbb{F}_p^* I_2$ and the conjugates of $m_1$. Since $m_1 \notin \mathbb{F}_p^* I_2$, (14.40) implies that
$m_1$ doesn't commute with at least one of its conjugates, say $m_2$. Then $g = m_1 m_2^{-1} \in B$
has $\det(g) = 1$, since $\det(m_1) = \det(m_2)$. It is also easy to see that $g$ doesn't commute
with $m_1$, so that $g \notin \mathbb{F}_p^* I_2$. Hence the conjugates of $[g]$ generate $B'$, which means that
$B$ is generated by $\mathbb{F}_p^* I_2$ and the conjugates of $g$. Arguing as above, $g$ has a conjugate
$h$ such that $gh \neq hg$. Also note that $\det(h) = 1$. Since $[g][h] = [h][g]$ ($B'$ is Abelian),
(14.23) implies the desired equation $gh = -hg$.

Let $g, h \in B$ satisfy (14.41). Then $[g]$ and $[h]$ generate the subgroup $H$ defined in
Proposition 14.4.4. Thus

$H \subset B' \subset G_0'$.

Since $B'$ is Abelian, we have $B' \subset C(H) = H$, where the last equality is by Proposition
14.4.4. Thus $H = B'$. Since $B'$ is normal in $G_0'$, we also have $G_0' \subset N(B') = N(H)$.
As noted at the beginning of Case 2, this completes the proof of the theorem. ∎


A much more sophisticated proof of Theorem 14.4.6 can be found in [17, §21]. 
This reference studies solvable subgroups of GL(n, F,¢) for arbitrary n and ¢. 

When $p$ is large, Theorem 14.4.6 implies that solvable primitive subgroups of $S_{p^2}$
are relatively small in size. Here is an example.

Example 14.4.7 When $p = 17$, Propositions 14.4.1, 14.4.2, and 14.4.5 imply that
the orders of $M_1$, $M_2$, $M_3$ are

$|M_1| = 2 \cdot 17^2(17^2 - 1) = 166464 \approx 1.7 \times 10^5$,
$|M_2| = 2 \cdot 17^2(17 - 1)^2 = 147968 \approx 1.5 \times 10^5$,
$|M_3| = 24 \cdot 17^2(17 - 1) = 110976 \approx 1.1 \times 10^5$.
By Theorem 14.4.6, solvable primitive subgroups of $S_{17^2}$ are extremely small when
compared to $|S_{17^2}| \approx 2.1 \times 10^{587}$. In contrast, recall from Example 14.2.17 that the
largest solvable imprimitive subgroup of $S_{17^2}$ has order

$|\mathrm{AGL}(1,\mathbb{F}_{17}) \wr \mathrm{AGL}(1,\mathbb{F}_{17})| = 17^{18}16^{18} \approx 6.6 \times 10^{43}$.
Thus being solvable and primitive is much more restrictive than being solvable and
imprimitive.
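
The arithmetic in Example 14.4.7 is easy to reproduce. The following Python sketch evaluates the three order formulas quoted in the example, together with the order $(p(p-1))^{p+1}$ of the wreath product, which follows from $|\mathrm{AGL}(1,\mathbb{F}_p)| = p(p-1)$. The function names are illustrative assumptions.

```python
from math import factorial

def primitive_orders(p):
    """Orders of M_1, M_2, M_3 from the formulas quoted in Example 14.4.7."""
    m1 = 2 * p**2 * (p**2 - 1)
    m2 = 2 * p**2 * (p - 1) ** 2
    m3 = 24 * p**2 * (p - 1)
    return m1, m2, m3

def wreath_order(p):
    """Order of AGL(1,F_p) wr AGL(1,F_p): p copies of AGL(1,F_p) times one more."""
    agl = p * (p - 1)
    return agl**p * agl

p = 17
print(primitive_orders(p))            # the three orders for p = 17
print(f"{wreath_order(p):.1e}")       # order of the wreath product
print(len(str(factorial(p**2))))      # number of decimal digits of |S_{p^2}|
```

Comparing the digit count of $|S_{17^2}|$ with the six-digit orders of $M_1$, $M_2$, $M_3$ makes the size gap in the example concrete.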
Combining Corollary 14.2.16 and Theorem 14.4.6, we get the following criterion
for when an irreducible polynomial of degree $p^2$ is solvable by radicals.

Corollary 14.4.8 Let $f \in F[x]$ be irreducible of degree $p^2$, where $F$ is a field of
characteristic 0. Then $f$ is solvable by radicals over $F$ if and only if either

(a) $f$ is imprimitive and the Galois group of $f$ over $F$ is isomorphic to a subgroup
of the wreath product $\mathrm{AGL}(1,\mathbb{F}_p) \wr \mathrm{AGL}(1,\mathbb{F}_p)$, or

(b) $f$ is primitive and the Galois group of $f$ over $F$ is isomorphic to a subgroup of
one of the groups $M_1$, $M_2$, $M_3$ defined in (14.19), (14.21), and (14.26). ∎


Mathematical Notes 
This section includes some interesting ideas from group theory. 


■ Solvable Linear Groups. For most of the proof of Theorem 14.4.6, we worked
with the group $G_0 \subset \mathrm{GL}(2,\mathbb{F}_p)$. From this point of view, the argument showed that
every solvable irreducible subgroup of $\mathrm{GL}(2,\mathbb{F}_p)$ is conjugate to a subgroup of $(M_1)_0$,
$(M_2)_0$, or $(M_3)_0$. A systematic approach to the study of solvable linear groups can be
found in [9] and [17].

■ Doubly Transitive Solvable Permutation Groups. In Proposition 14.4.1, we
showed that $M_1 = \mathrm{A\Gamma L}(1,\mathbb{F}_{p^2})$ is solvable and doubly transitive. What is more
surprising is that, with some exceptions for small primes, this group contains all
solvable doubly transitive subgroups of $S_{p^2}$.

Theorem 14.4.9 Let $p > 23$ be prime. Then every solvable doubly transitive subgroup
$G \subset S_{p^2}$ is conjugate to a subgroup of $M_1 = \mathrm{A\Gamma L}(1,\mathbb{F}_{p^2})$.


Proof: Since $G$ is solvable, Theorem 14.4.6 implies that $G$ is conjugate to a subgroup
of $M_1$, $M_2$, or $M_3$. Furthermore, since $G$ is doubly transitive, Proposition 14.3.4
implies that $|G|$ is divisible by $p^2(p^2 - 1)$. However,

$|M_2| = 2p^2(p - 1)^2$ and $|M_3| = 24p^2(p - 1)$
are not divisible by $p^2(p^2 - 1)$ when $p > 23$. This proves the theorem. ∎
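
The divisibility claim at the end of the proof can be tested directly. Dividing out $p^2(p-1)$, it amounts to asking whether $p + 1$ divides $2(p-1)$ or $24$. The Python sketch below lists the primes below a chosen bound for which $p^2(p^2-1)$ does divide $|M_2|$ or $|M_3|$. The bound and the function names are illustrative assumptions, and the sketch checks only the arithmetic, not whether the groups themselves are defined for every listed prime.

```python
def is_prime(n):
    """Trial division, adequate for small n."""
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True

def exceptional_primes(limit):
    """Primes p < limit with p^2(p^2-1) dividing |M_2| or |M_3|."""
    found = []
    for p in range(2, limit):
        if not is_prime(p):
            continue
        need = p**2 * (p**2 - 1)
        m2 = 2 * p**2 * (p - 1) ** 2
        m3 = 24 * p**2 * (p - 1)
        if m2 % need == 0 or m3 % need == 0:
            found.append(p)
    return found

print(exceptional_primes(1000))
```

The largest prime in the output is 23, in agreement with the hypothesis $p > 23$ of the theorem.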
The following much stronger result was proved by Huppert in 1957.

Theorem 14.4.10 Let $G \subset S_\ell$ be solvable and doubly transitive. Then $\ell = p^m$ for
some prime $p$. Furthermore, if $p^m \notin \{3^2, 5^2, 7^2, 11^2, 23^2, 3^4\}$, then $G$ is conjugate to
a subgroup of $\mathrm{A\Gamma L}(1,\mathbb{F}_{p^m})$.

Proof: Our hypothesis implies that $G$ is solvable and primitive, and then $\ell = p^m$ by
Theorem 14.3.17. This is the easy part of the proof. For the rest of the argument, see
[11, §7 of Ch. XI]. ∎

In fact, one can prove that up to conjugacy, all solvable doubly transitive permutation
groups lie in $\mathrm{A\Gamma L}(1,\mathbb{F}_{p^m})$, except for the 13 groups described in [10].


■ Classifying Permutation Groups. Besides the two classes of groups just discussed
(solvable linear groups and solvable doubly transitive groups), there has been a lot of
work on classifying other sorts of interesting groups. Here is a brief sample of what
has been done:

• Solvable primitive subgroups of $S_n$ for $n < 256$. See [16].

• Primitive subgroups of $S_n$ for $n < 1000$. See [13].

• Transitive subgroups of $S_{p^2}$ for all primes $p$. See [4].

• All subgroups of $\mathrm{PSL}(2,\mathbb{F}_{p^m})$. See [2, Ch. XII] or [8, §8 of Ch. II].

The last bullet has some unexpected relations with Section 7.5 and Proposition 14.4.4.
See Exercise 19 for an interesting subgroup of $\mathrm{PSL}(2,\mathbb{F}_p)$ when $p \equiv 1 \bmod 8$.


Historical Notes 


Galois worked very hard to understand solvable primitive subgroups, though
his research was incomplete at the time of his death. In the Historical Notes to
Section 14.3, we gave quotations from Galois’s paper on finite fields describing
$\mathrm{A\Gamma L}(1,\mathbb{F}_{p^n})$ and his version of Theorem 14.3.17, which asserts that up to conjugacy,
a solvable primitive group $G$ satisfies

$\mathbb{F}_p^n \subset G \subset \mathrm{AGL}(n,\mathbb{F}_p) \subset S_{p^n}$.

In this paper, Galois notes that $\mathrm{A\Gamma L}(1,\mathbb{F}_{p^n})$ is solvable and that any polynomial whose
Galois group is a subgroup of this group is solvable by radicals. He also makes the
following intriguing statement [Galois, p. 125]:
This remark would be of little importance if I had not already demonstrated
that reciprocally, a primitive equation would not be known to be solvable by
radicals, without satisfying the conditions that I have just stated. (I exclude
equations of the 9th and 25th degree.)

Galois seems to be saying that, with a few exceptions, a solvable primitive permutation
group satisfies

(14.42) $G \subset \mathrm{A\Gamma L}(1,\mathbb{F}_{p^n})$.

As we know from Theorem 14.4.6, this is not correct, for the groups $M_2$ and $M_3$
are counterexamples when $n = 2$. On the other hand, if we replace “primitive” with
“doubly transitive,” then we get a statement close to Huppert’s theorem about doubly
transitive solvable groups (Theorem 14.4.10). Furthermore, in his letter to Chevalier,
Galois indicates that the above assertion is “too restricted. There are few exceptions,
but there are some” [Galois, p. 177]. So it is hard to know exactly what Galois
was thinking. Nevertheless, the results of this chapter make it abundantly clear that
Galois’s insight into permutation groups was nothing short of astonishing. The reader
may wish to consult [12] for further discussion of these issues.

The proof of Theorem 14.4.6 given in the text is based on suggestions of Walt
Parry and Jordan’s 1868 paper Sur la résolution algébrique des équations primitives
de degré $p^2$ ($p$ étant premier impair) [Jordan 2, pp. 171-195]. Jordan was aware
that his results provide counterexamples to some of Galois’s assertions.

Readers interested in learning more about the history of transitive permutation 
groups should consult the introduction to [13] and Appendix A of [16]. 


Exercises for Section 14.4 


Exercise 1. Prove that $M_1 = \mathrm{A\Gamma L}(1,\mathbb{F}_{p^2})$ is solvable, and compute its order.

Exercise 2. This exercise will study the subgroup $M_2 \subset \mathrm{AGL}(2,\mathbb{F}_p)$ defined in (14.21).

(a) Prove that the map defined in (14.20) gives an element of $\mathrm{AGL}(2,\mathbb{F}_p)$.

(b) Prove that $\begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}$ has order 2 and normalizes $\mathrm{AGL}(1,\mathbb{F}_p) \times \mathrm{AGL}(1,\mathbb{F}_p) \subset \mathrm{AGL}(2,\mathbb{F}_p)$.
(c) Prove that $M_2$ is solvable, and compute its order.

(d) Prove that $(M_2)_0$ is generated by the matrices in (14.22).

(e) Prove that $\mathrm{AGL}(1,\mathbb{F}_p) \times \mathrm{AGL}(1,\mathbb{F}_p) \subset \mathrm{AGL}(2,\mathbb{F}_p)$ is imprimitive in $S_{p^2}$.

Exercise 3. Let $M_1$ and $M_2$ be the groups defined in the text, and assume that $p > 3$. Prove
that $M_2$ is not doubly transitive and not isomorphic to a subgroup of $M_1$.

Exercise 4. Let $V$ be a vector space of dimension 2 over a field $F$, and let $T : V \to V$ be a
linear map that is not a multiple of the identity. Also assume that $T$ is an isomorphism. Prove
that there is $v \in V$ such that $v$ and $T(v)$ form a basis of $V$ over $F$.

Exercise 5. Fix $a \in \mathbb{F}_p$, $p > 2$. The goal of this exercise is to find $s, t \in \mathbb{F}_p$ with $s^2 + t^2 = a$.

(a) Let $S = \{s^2 \mid s \in \mathbb{F}_p\}$. Prove that $|S| = (p + 1)/2$.

(b) Let $S' = \{a - s^2 \mid s \in \mathbb{F}_p\}$. Show that $S \cap S' \neq \emptyset$, and use this to prove the existence of
$s, t \in \mathbb{F}_p$ such that $s^2 + t^2 = a$.
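
Before attempting the proof, it can help to see the statement of Exercise 5 on small primes. The following Python sketch counts the squares in $\mathbb{F}_p$ and searches by brute force for $s, t$ with $s^2 + t^2 = a$. It is an experiment under illustrative assumptions (the function names and the choice of primes), not a substitute for the counting argument the exercise asks for.

```python
def squares(p):
    """The set S = {s^2 : s in F_p}."""
    return {(s * s) % p for s in range(p)}

def two_square_rep(a, p):
    """Return some (s, t) with s^2 + t^2 = a in F_p, or None if none exists."""
    S = squares(p)
    for s in range(p):
        # a - s^2 lies in S exactly when it equals t^2 for some t.
        if (a - s * s) % p in S:
            for t in range(p):
                if (s * s + t * t - a) % p == 0:
                    return s, t
    return None

for p in (3, 5, 7, 11, 13):
    reps = {a: two_square_rep(a, p) for a in range(p)}
    print(p, len(squares(p)), all(r is not None for r in reps.values()))
```

Each output line reports the prime, the size of $S$, and whether every $a \in \mathbb{F}_p$ was represented.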


Exercise 6. Let $A = \begin{pmatrix} a & b \\ c & d \end{pmatrix}$ be a $2 \times 2$ matrix with entries in a field $F$.
(a) Prove that the characteristic polynomial of $A$ is $P(x) = x^2 - \mathrm{tr}(A)x + \det(A)$, where
$\mathrm{tr}(A) = a + d$ and $\det(A) = ad - bc$ are the trace and determinant of $A$.
(b) Prove that $P(A) = A^2 - \mathrm{tr}(A)A + \det(A)I_2$ is the zero matrix.
The Cayley-Hamilton Theorem generalizes part (b) by showing that $P(A)$ is the zero matrix
when $P(x)$ is the characteristic polynomial of an $n \times n$ matrix $A$.

Exercise 7. Complete the proof of $C(H) = H$ from Proposition 14.4.4 begun in the text.

Exercise 8. Let $G$ be a group with a normal subgroup $H \simeq (\mathbb{Z}/2\mathbb{Z})^2$ such that $C_G(H) = H$
and the map $G \to \mathrm{Aut}(H)$ given by conjugation is onto. The goal of this exercise is to prove
that $G \simeq S_4$. Note that $|G| = 24$ by the proof of Proposition 14.4.4.
(a) Use the Sylow Theorems to show that $G$ has one or four 3-Sylow subgroups. Then use
$C_G(H) = H$ to show that the number is four.

(b) Let A; be a 3-Sylow subgroup of G. Use part (a) and the Sylow Theorems to show that 
its normalizer has order 6. 

(c) Now consider the homomorphism $\phi : G \to S_4$ given by the action of $G$ by conjugation on
the 3-Sylow subgroups. Use part (b) to prove that $\mathrm{Ker}(\phi)$ cannot contain an element of
order 3.

(d) Conclude that the image of $\phi$ contains $A_4$. It follows that if $\phi$ is not an isomorphism, then
$G$ contains a normal subgroup of order 2.

(e) Prove that $G$ cannot contain a normal subgroup of order 2. Thus $\phi : G \simeq S_4$.


This exercise is closely related to Exercise 3 of Section 14.2. 


Exercise 9. Let $g$ and $C(g)$ be as in the proof of part (c) of Proposition 14.4.4.

(a) Show that $C(g)$ is Abelian and contains $\mathbb{F}_p^* I_2$.

(b) If $m \in C(g)$, then it is easy to see that $\det(m)m^{-2} \in C(g)$. By part (a), it follows
that $\phi(m) = \det(m)m^{-2}$ defines a group homomorphism $\phi : C(g) \to C(g)$. Prove that
$\mathrm{Ker}(\phi) = \mathbb{F}_p^* I_2$ and $|\mathrm{Im}(\phi)| = |C(g)|/(p - 1)$.

(c) Prove that $\mathrm{Im}(\phi) \subset \{w \in C(g) \mid \det(w) = 1\}$.

(d) Explain why we may assume that $g = \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix}$. Then use Lemma 14.4.3 and Exercise 5 to
show that $\det : C(g) \to \mathbb{F}_p^*$ is onto. Conclude that

$\mathrm{Im}(\phi) = \{w \in C(g) \mid \det(w) = 1\}$.

The equality proved in part (d) shows that every element of $C(g)$ of determinant 1 is of the form
$\det(m)m^{-2}$ for some $m \in C(g)$. This will be used in the proof of part (c) of Proposition 14.4.4.


Exercise 10. Consider the subgroup $N(H) \subset \mathrm{PGL}(2,\mathbb{F}_p)$ defined in Proposition 14.4.4.
(a) Prove that the images of the matrices (14.25) generate $N(H)$ when $p \equiv 1 \bmod 4$.
(b) Prove that generators of H and the images of the matrices 


1 -1 4 Ss t-—1 
1 oa) AM Net Hs 
from [17, p. 163] generate $N(H)$ when $p \equiv 3 \bmod 4$.


Exercise 11. Let $M_3$ be as in Proposition 14.4.5. Show that $M_3$ is solvable of order $24p^2(p - 1)$.

Exercise 12. Consider the subgroups $M_1$, $M_2$, and $M_3$ defined in the text.
(a) Show that $(M_1)_0$ and $(M_2)_0$ have Abelian subgroups of index 2, and use this to prove that
neither can contain $(M_3)_0$. This proves that $M_3 \not\subset M_1$ and $M_3 \not\subset M_2$.
(b) Explain why $M_3 = \mathrm{AGL}(2,\mathbb{F}_3)$ when $p = 3$.
(c) Show that $(M_1)_0/\mathbb{F}_p^* I_2$ has an element of order $p + 1$, and use this to prove that $M_1 \not\subset M_3$
when $p > 3$.
(d) Show that $M_2 \not\subset M_3$ when $p > 5$.
(e) Show that $M_2 \subset M_3$ when $p = 5$.
It follows that the only exceptions to (14.27) are $M_1 \subset M_3$ and $M_2 \subset M_3$ when $p = 3$ and
$M_2 \subset M_3$ when $p = 5$. This result is due to Jordan.

Exercise 13. Let $G_0 \subset \mathrm{GL}(2,\mathbb{F}_p)$ be solvable. Prove that the subgroup generated by $G_0$ and
$\mathbb{F}_p^* I_2$ is also solvable.

Exercise 14. Let $g = \begin{pmatrix} \alpha & 0 \\ 0 & \beta \end{pmatrix}$, where $\alpha, \beta \in \mathbb{F}_p^*$ and $\alpha \neq \beta$.
(a) Prove (14.29).

(b) Let $m = \begin{pmatrix} a & b \\ c & d \end{pmatrix} \in N(C(g))$. In the argument following (14.29), we claimed that $b = c = 0$
or $a = d = 0$. Supply the missing details.

(c) Prove that $(M_2)_0 \subset N(C(g))$.


Exercise 15. Prove (14.30) and (14.31). 


Exercise 16. Let $V, W$ be vector spaces over a field $F$, and let $\mathrm{Aut}_F(V)$ be the group of vector
space isomorphisms $V \simeq V$. Also let $T : V \to W$ be a vector space isomorphism.
(a) Prove that $\phi \mapsto T \circ \phi \circ T^{-1}$ induces a group isomorphism $\gamma_T : \mathrm{Aut}_F(V) \simeq \mathrm{Aut}_F(W)$.
(b) Let $T' : V \to W$ be another isomorphism. Prove that there is $\Phi \in \mathrm{Aut}_F(W)$ such that
$T' = \Phi \circ T$. In the notation of part (a), $\gamma_\Phi : \mathrm{Aut}_F(W) \simeq \mathrm{Aut}_F(W)$ is conjugation by $\Phi$.
(c) In the situation of part (b), prove that $\gamma_{T'} = \gamma_\Phi \circ \gamma_T$.

Exercise 17. Fix $\alpha \in \mathbb{F}_{p^2} \setminus \mathbb{F}_p$, and let $\gamma_\alpha$ be as defined just before (14.35).
(a) Prove (14.37) and (14.39). For (14.37), you should use the argument from the proof of
Lemma 14.4.3.

(b) Prove that $\Gamma\mathrm{L}(1,\mathbb{F}_{p^2}) \subset N(C(\gamma_\alpha))$.


Exercise 18. Let M be a finite group. 
(a) Let $A \subset M$ be a minimal normal subgroup, and let $g \neq e$ be in $A$. Prove that $A$ is generated
by the elements $hgh^{-1}$ as $h$ varies over all elements of $M$.

(b) Let $A \subset M$ be a normal subgroup. Prove that the center $Z(A)$ of $A$ is normal in $M$.


Exercise 19. In the Mathematical Notes, we mentioned that all subgroups of $\mathrm{PSL}(2,\mathbb{F}_p)$ are
known up to conjugacy. We will do a small part of this classification by proving that $\mathrm{PSL}(2,\mathbb{F}_p)$
contains a subgroup isomorphic to $S_4$ when $p \equiv 1 \bmod 8$. To begin, note that by Exercise 10,
the images of the matrices (14.25) generate a subgroup of $\mathrm{PGL}(2,\mathbb{F}_p)$ isomorphic to $S_4$.

(a) Explain why $\mathbb{F}_p^*$ has an element $\zeta$ of order 8. Then $i = \zeta^2$ has order 4.

(b) Compute $(1 + i)^2$ and use this to prove that there is $a \in \mathbb{F}_p$ such that $a^2 = 2$.

(c) Show that the matrices (14.25) lie in $\mathrm{SL}(2,\mathbb{F}_p)$ after multiplication by suitable elements
of $\mathbb{F}_p^*$. Hence their images generate a subgroup of $\mathrm{PSL}(2,\mathbb{F}_p)$ isomorphic to $S_4$.

(d) Over $\mathbb{C}$, $\zeta_8 = \cos(2\pi/8) + i\sin(2\pi/8) = (1 + i)/\sqrt{2}$. How does this relate to part (b)?
More generally, one can prove that if $q = p^m$ and $p > 2$, then $\mathrm{PSL}(2,\mathbb{F}_q)$ always contains a
copy of $A_4$, and it contains a copy of $S_4$ if and only if $q \equiv \pm 1 \bmod 8$ (see [2, Ch. XII] or [8, §8
of Ch. II]). You should compare the list of groups given in these references with (7.29), which
asserts that the finite subgroups of $\mathrm{PSL}(2,\mathbb{C}) = \mathrm{PGL}(2,\mathbb{C})$ are cyclic, dihedral, or isomorphic
to $A_4$, $S_4$, or $A_5$.
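
Parts (a) and (b) of Exercise 19 can be explored numerically for specific primes. The Python sketch below searches $\mathbb{F}_p^*$ for an element $\zeta$ with $\zeta^4 = -1$ (so $\zeta$ has order exactly 8), sets $i = \zeta^2$, and forms $(1 + i)\zeta^{-1}$ as a candidate square root of 2. The function names and sample primes are illustrative assumptions, and the three-argument `pow` with exponent `-1` needs Python 3.8 or later. Explaining why the candidate works is still the content of part (b).

```python
def element_of_order_8(p):
    """Find zeta in F_p^* with zeta^4 = -1, assuming p = 1 mod 8."""
    if p % 8 != 1:
        raise ValueError("expected p = 1 mod 8")
    for x in range(2, p):
        zeta = pow(x, (p - 1) // 8, p)   # zeta^8 = x^(p-1) = 1
        if pow(zeta, 4, p) == p - 1:     # order is exactly 8
            return zeta
    raise ValueError("no element of order 8 found")

def sqrt_two_candidate(p):
    """Return (1 + i) / zeta in F_p, where i = zeta^2."""
    zeta = element_of_order_8(p)
    i = (zeta * zeta) % p
    return ((1 + i) * pow(zeta, -1, p)) % p

for p in (17, 41, 73):
    a = sqrt_two_candidate(p)
    print(p, a, (a * a) % p)
```

Each output line shows the prime, the candidate $a$, and $a^2 \bmod p$.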


Exercise 20. Assume that $g, h \in \mathrm{GL}(2,\mathbb{F}_p)$ satisfy $gh = -hg$ and $\det(g) = \det(h) = 1$, as in
part (a) of Proposition 14.4.4. Also assume that $p > 2$.

(a) Prove that the subgroup $\langle g, h \rangle \subset \mathrm{GL}(2,\mathbb{F}_p)$ is isomorphic to the quaternion group
$Q = \{\pm 1, \pm i, \pm j, \pm k\}$, where $i^2 = j^2 = k^2 = -1$, $ij = -ji = k$, and $-1 \in Z(Q)$.

(b) Prove that $(M_3)_0$ is the normalizer of $\langle g, h \rangle$ in $\mathrm{GL}(2,\mathbb{F}_p)$.
The quaternion group is an example of an extraspecial 2-group. The normalizer of an extraspecial
2-group in $\mathrm{GL}(2,\mathbb{F}_p)$ is part of Aschbacher’s classification of subgroups of $\mathrm{GL}(n,\mathbb{F}_p)$. This
is explained (briefly) in [13].

CHAPTER 15


THE LEMNISCATE 


The lemniscate is the plane curve defined by the equation $(x^2 + y^2)^2 = x^2 - y^2$. Here
is a picture:


We will consider the Galois groups of polynomials arising from division of the 
lemniscate into arcs of equal length. This will allow us to prove the following 
wonderful theorem of Abel [Abel, Vol. I, p. 314]: 


One can divide the entire circumference of the lemniscate into $m$ equal parts by
ruler and compass alone, if $m$ is of the form $2^n$ or $2^n + 1$, the last number being
at the same time prime; or as well if $m$ is a product of several numbers of these
two forms.

Abel goes on to say that this theorem is “precisely the same as that of M. Gauss,
relative to the circle.” You will verify this in Exercise 1.

To prove Abel’s theorem, we will study doubly periodic functions of a complex 
variable and the theory of complex multiplication. We will also learn why Eisenstein 
proved his irreducibility criterion. 


15.1 DIVISION POINTS AND ARC LENGTH 


To formulate Abel’s theorem on the lemniscate carefully, we need to define the 
n-division points of the lemniscate and study the arc length of this curve. 


A. Division Points of the Lemniscate. In Section 10.2 we used the nth roots 
of unity to determine when a regular n-gon can be constructed by straightedge and 
compass. In terms of the unit circle centered at the origin, the nth roots of unity 
divide the circle into $n$ segments of equal length, starting from $(1,0)$. For $n = 5$, the
fifth roots of unity $1, \zeta_5, \zeta_5^2, \zeta_5^3, \zeta_5^4$ give the picture:

In general, the $n$th roots of unity $\zeta_n^i$, $i = 0, \ldots, n - 1$, are the $n$-division points of the
unit circle. Then Gauss’s theorem of Section 10.2 can be restated as the assertion that
the $n$-division points of the unit circle are constructible by straightedge and compass
if and only if $n$ is a power of 2 times a product of distinct Fermat primes.

Abel, following hints of Gauss, asked the same question for the lemniscate. Here, 
the n-division points of the lemniscate are obtained as follows. Begin at the origin 
and follow the curve into the first quadrant, down into the fourth quadrant, back 
through the origin into the second quadrant, down into the third quadrant, and finally 
back to the origin. As we do this, we mark those points that give one-nth of the total 
arc length, two-nths of the arc length, etc. For n = 5, this gives the picture:

The n-division points divide the lemniscate into n segments of equal length. When
n is odd, as in the above picture, the middle segment straddles the origin. When n 
is even, the n-division points are symmetric about the x- and y-axes, with the middle 
division point at the origin. For n = 6, the 6-division points give the picture: 


The n-division points on the lemniscate will lead to some remarkable polynomials 
analogous to the cyclotomic polynomials. The Galois theory of these polynomi- 
als will enable us to understand when the n-division points can be constructed by 
straightedge and compass. 

At the beginning of the chapter, we defined the lemniscate using the Cartesian
equation $(x^2 + y^2)^2 = x^2 - y^2$. In Exercise 2 you will show that in polar coordinates,
the lemniscate is given by the equation

(15.1) $r^2 = \cos(2\theta)$.

The polar coordinate $r$ will play a central role in this chapter. One reason is that in
order to construct a point on the lemniscate, we only need $r$. This might seem obvious
in that we get the desired point (and its mirror images about the $x$- and $y$-axes) by
intersecting the lemniscate with the circle of radius $r$. But the lemniscate is actually
unnecessary. In other words, if $0 < r \le 1$ is constructible in the sense of Section 10.1,
then so are the $x$- and $y$-coordinates of the four points on the lemniscate of distance $r$
from the origin. To see this, we use $(x^2 + y^2)^2 = x^2 - y^2$ and $r^2 = x^2 + y^2$. This gives
the equations

$r^4 = x^2 - y^2$ and $r^2 = x^2 + y^2$.

Solving for $x$ and $y$ in terms of $r$, we obtain

$x = \pm\sqrt{\tfrac{1}{2}(r^2 + r^4)}$ and $y = \pm\sqrt{\tfrac{1}{2}(r^2 - r^4)}$.


Since the constructible numbers form a subfield of C closed under square roots, we 
see that x and y are constructible when r is. Thus, to prove that a given point on 
the lemniscate is constructible by straightedge and compass, it suffices to show that 
the corresponding polar coordinate r is constructible. Also note that the converse 
holds: if x and y are constructible, then so is r = ,/x? + y?. We have thus proved the 
following result. 


Proposition 15.1.1 Let $P$ be a point on the lemniscate, and let $r$ be the distance from
$P$ to the origin. Then $P$ can be constructed by straightedge and compass if and only
if $r$ is a constructible number. ∎
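
The formulas for $x$ and $y$ in terms of $r$ used to prove Proposition 15.1.1 are easy to sanity-check in floating point. The Python sketch below recovers the first-quadrant point from a sample $r$ and confirms that it lies on the lemniscate at distance $r$ from the origin. The function name and the sample values of $r$ are illustrative assumptions.

```python
from math import sqrt

def lemniscate_point(r):
    """First-quadrant point (x, y) on (x^2+y^2)^2 = x^2-y^2 at distance r, 0 <= r <= 1."""
    x = sqrt((r**2 + r**4) / 2)
    y = sqrt((r**2 - r**4) / 2)
    return x, y

for r in (0.3, 0.5, 0.9, 1.0):
    x, y = lemniscate_point(r)
    on_curve = abs((x**2 + y**2) ** 2 - (x**2 - y**2)) < 1e-12
    right_distance = abs(x**2 + y**2 - r**2) < 1e-12
    print(r, round(x, 6), round(y, 6), on_curve, right_distance)
```

Only field operations and square roots appear in `lemniscate_point`, which mirrors the reason $x$ and $y$ are constructible whenever $r$ is.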
B. Arc Length of the Lemniscate. The $n$-division points of the lemniscate
are defined in terms of arc length. Hence we need to study the arc length of the
lemniscate. By (15.1), the polar equation of the lemniscate is

$r^2 = \cos(2\theta)$.

If we focus on the first quadrant, then we get the picture:

Solving the above equation for $\theta$ gives $\theta = \tfrac{1}{2}\cos^{-1}(r^2)$. This makes $\theta$ into a function
of $r$. Note that $\theta$ decreases from $\tfrac{\pi}{4}$ to $0$ as $r$ increases from 0 to 1.
Recall that arc length in Cartesian and polar coordinates is given by

$ds = \sqrt{dx^2 + dy^2} = \sqrt{dr^2 + r^2\,d\theta^2}$.

It follows that the arc length of the lemniscate from the origin to the point in the first
quadrant with polar coordinates $(r_0, \theta_0)$ is given by

arc length $= \int_0^{r_0} \sqrt{1 + r^2\left(\frac{d\theta}{dr}\right)^2}\,dr$.

Differentiating $r^2 = \cos(2\theta)$ with respect to $r$ gives $2r = -\sin(2\theta) \cdot 2\frac{d\theta}{dr}$, so that

$1 + r^2\left(\frac{d\theta}{dr}\right)^2 = 1 + r^2\left(\frac{-r}{\sin(2\theta)}\right)^2 = 1 + \frac{r^4}{\sin^2(2\theta)}$.

Since $\sin^2(2\theta) = 1 - \cos^2(2\theta) = 1 - r^4$, we obtain

$1 + r^2\left(\frac{d\theta}{dr}\right)^2 = 1 + \frac{r^4}{1 - r^4} = \frac{1}{1 - r^4}$.

Hence our arc length formula becomes

(15.2) arc length $= \int_0^{r_0} \frac{1}{\sqrt{1 - r^4}}\,dr$.

The integral in (15.2) is improper when $r_0 = 1$. Since it converges (see Exercise 3),
$\int_0^1 (1 - r^4)^{-1/2}\,dr$ is the arc length of the first-quadrant portion of the lemniscate. In
the eighteenth century, this number was denoted $\frac{\varpi}{2}$, where $\varpi$ is a variant of the Greek
letter $\pi$. Thus

$\varpi = 2\int_0^1 \frac{1}{\sqrt{1 - r^4}}\,dr \approx 2.62206$.

It follows that the arc length of the lemniscate is $2\varpi$ and the arc length between
successive $n$-division points is $\frac{2\varpi}{n}$.
We will write (15.2) as

(15.3) $s = \int_0^r \frac{1}{\sqrt{1 - t^4}}\,dt$,

where $s$ represents the arc length along the lemniscate from the origin to the point in
the first quadrant with polar coordinates $(r, \theta)$. Then (15.3) expresses $s$ as a function
of $r$. Following Abel, the inverse function will be written $r = \varphi(s)$, so that

(15.4) $r = \varphi(s) \iff s = \int_0^r \frac{1}{\sqrt{1 - t^4}}\,dt$.

Since
$0 \le r \le 1$ corresponds to $0 \le s \le \frac{\varpi}{2}$,

we see that $\varphi$ is defined on the interval $[0, \frac{\varpi}{2}]$. In Section 15.2 we will extend $\varphi$ to
a periodic function on $\mathbb{R}$, and in Section 15.3 we will further extend $\varphi$ to a doubly
periodic meromorphic function on $\mathbb{C}$.
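
The definitions (15.3) and (15.4) can be made concrete numerically. The Python sketch below is one possible approach under stated assumptions, not a method taken from the text: it removes the singularity at $t = 1$ with the substitution $t = \sin u$, which turns $\int_0^r (1 - t^4)^{-1/2}\,dt$ into $\int_0^{\arcsin r} (1 + \sin^2 u)^{-1/2}\,du$, applies Simpson's rule, and inverts $s(r)$ by bisection to approximate $\varphi$ on $[0, \varpi/2]$. The function names, step count, and tolerance are illustrative choices.

```python
from math import asin, sin, sqrt

def arc_length(r, steps=2000):
    """s(r) = integral of (1 - t^4)^(-1/2) over [0, r], via t = sin(u) and Simpson's rule."""
    upper = asin(r)
    if upper == 0.0:
        return 0.0
    h = upper / steps                      # steps must be even for Simpson's rule
    f = lambda u: 1.0 / sqrt(1.0 + sin(u) ** 2)
    total = f(0.0) + f(upper)
    for k in range(1, steps):
        total += (4 if k % 2 else 2) * f(k * h)
    return total * h / 3

VARPI = 2 * arc_length(1.0)                # numerical value of the constant in the text

def phi(s, tol=1e-12):
    """Approximate Abel's function on [0, VARPI/2] by bisection on r in [0, 1]."""
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if arc_length(mid) < s:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

print(round(VARPI, 5))
print(round(phi(arc_length(0.5)), 6))
```

Bisection works here because $s(r)$ is strictly increasing on $[0, 1]$; it is slow but needs no derivative information.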

In particular, when $n > 4$, the first $n$-division point of the lemniscate lies in the first
quadrant. Since its arc length from the origin is $\frac{2\varpi}{n}$, Proposition 15.1.1 implies that
the first $n$-division point is constructible by straightedge and compass if and only if

$r_0 = \varphi\left(\frac{2\varpi}{n}\right)$
is a constructible number. In Section 15.2 we will develop multiplication formulas
for $\varphi(ns)$, $n \in \mathbb{Z}$, and use them to show that:

• $\varphi(\frac{2\varpi}{n})$ is the root of a polynomial with coefficients in $\mathbb{Z}$.

• $\varphi(\frac{2\varpi}{n})$ is constructible if and only if all $n$-division points are constructible by
straightedge and compass.

In Section 15.5 we will consider the Galois group of the extension
$\mathbb{Q}(i) \subset \mathbb{Q}(i, \varphi(\frac{2\varpi}{n}))$.

The appearance of $i = \sqrt{-1}$ is unexpected but will make perfect sense once we study
the complex multiplication formulas for $\varphi((n + im)s)$, $n + im \in \mathbb{Z}[i]$, in Section 15.4.
Using this and some clever ideas of Eisenstein, we will then be able to prove Abel’s
theorem on the lemniscate.


Mathematical Notes 
Here are comments about two topics from this section.
■ Integrals and Inverse Functions. The definition of Abel’s function $r = \varphi(s)$
involves an integral defining $s$ in terms of $r$ and then an inverse function to get $r$ in
terms of $s$.

The idea of an inverse function of an integral is more common than you might
expect. For example, one standard definition of $e^x$ is to first define the natural
logarithm via the integral

$\ln(x) = \int_1^x \frac{1}{t}\,dt$, $x > 0$,

and then define $e^x$ to be the inverse function of $\ln(x)$. So $e^x$ is the inverse function of
an integral. Another example from calculus is the indefinite integral

$\int \frac{1}{\sqrt{1 - x^2}}\,dx = \sin^{-1}(x) + C$.

In terms of definite integrals, this can be written

$\sin^{-1}(x) = \int_0^x \frac{1}{\sqrt{1 - t^2}}\,dt$, $-1 \le x \le 1$.

The inverse function of $\sin^{-1}(x)$ is of course $\sin(x)$. So $\sin(x)$ is the inverse function
of an integral.

Now comes an intriguing idea. Suppose that we knew neither $\sin$ nor $\sin^{-1}$. How
can we understand the integral $\int (1 - x^2)^{-1/2}\,dx$? One way would be to define $\sin(x)$
to be the inverse function of

$x \mapsto \int_0^x \frac{1}{\sqrt{1 - t^2}}\,dt$.

Furthermore, if we define $\cos(x) = \frac{d}{dx}\sin(x)$ and $\pi = 2\int_0^1 (1 - t^2)^{-1/2}\,dt$, then all
standard properties of $\sin(x)$ and $\cos(x)$ can be derived from these definitions.

One way to regard (15.4) is that Abel’s function $\varphi(s)$ is obtained by applying the
same idea to the integral $\int (1 - x^4)^{-1/2}\,dx$. There are many analogies between $\sin(x)$
and $\varphi(s)$, and it is possible to develop the properties of these functions in parallel.
This is done nicely in [12, pp. 240-243]. See also [16, pp. 1-9].


■ Elliptic Integrals and Elliptic Functions. The integral $\int (1 - x^4)^{-1/2}\,dx$ is an
example of an elliptic integral, and elliptic functions are inverse functions of elliptic
integrals. In general, an elliptic integral is an indefinite integral of the form

(15.5) $\int \frac{A(x) + B(x)\sqrt{P(x)}}{C(x) + D(x)\sqrt{P(x)}}\,dx$,

where $A(x), B(x), C(x), D(x)$ are polynomials in $x$, and $P(x)$ is a polynomial in $x$ of
degree 3 or 4. If $P(x)$ has degree 1 or 2, then the integral (15.5) can be evaluated using
standard techniques of integration. So elliptic integrals, where $P(x)$ has degree 3 or
4, are the next integrals to consider. It follows that $\int (1 - x^4)^{-1/2}\,dx$ is an especially
simple example of an elliptic integral.

We will say more about elliptic integrals and elliptic functions in the next section.
For now, we conclude with another example of an elliptic integral. A standard ellipse
with center at the origin is given by an equation of the form $\frac{x^2}{a^2} + \frac{y^2}{b^2} = 1$. Assume
that $a = 1$ and $0 < b < 1$. If we set $k = \sqrt{1 - b^2}$, then in Exercise 4 you will show
that the ellipse

$x^2 + \frac{y^2}{b^2} = 1$

has arc length given by

(15.6) $4\int_0^1 \sqrt{\frac{1 - k^2x^2}{1 - x^2}}\,dx = 4\int_0^1 \frac{1 - k^2x^2}{\sqrt{(1 - x^2)(1 - k^2x^2)}}\,dx$.

This special case of (15.5) is where the “elliptic” in “elliptic integral” comes from.


Historical Notes 


The lemniscate first appeared in the mathematical literature as part of the ovals of
Cassini, described by the French astronomer Cassini in 1680. In Cartesian coordinates,
the ovals are the family of curves defined by the equation

(15.7) $((x - a)^2 + y^2)((x + a)^2 + y^2) = b^4$.

The lemniscate we’ve been studying corresponds to $a = b = 1/\sqrt{2}$ (Exercise 5). In
general, $a < b$ gives a dumbbell-shaped curve and $a > b$ gives two ovals, as in the
following picture:

Unaware of Cassini’s work, in 1694 Jacob (or James) Bernoulli gave the equation
of the lemniscate as
$xx + yy = a\sqrt{xx - yy}$.

He described the curve as having “the form of a figure 8 on its side, as of a band folded
into a knot, or of a lemniscus, or of a knot of a French ribbon.” Here, “lemniscus” is a
Latin word (taken from the Greek) meaning a hanging ribbon attached to the garland
worn by the winner of an athletic contest.

Bernoulli was led to this curve by an indirect route. In 1691 he encountered the
integral $2\int_0^1 (1 - t^4)^{-1/2}\,dt$ (this should look familiar) in his study of the elastic curve.
To represent this geometrically, he looked for a curve defined by an algebraic equation
whose arc length equals $2\int_0^1 (1 - t^4)^{-1/2}\,dt$. In 1694 he showed that the lemniscate
has the desired arc length, using the polar description of the lemniscate, as we did
earlier in the section. The reader should consult [2] for a discussion of the elastic
curve and Bernoulli’s priority dispute with his brother Johann, who independently
discovered the lemniscate in 1694 in a different context.

Bernoulli’s use of polar coordinates to compute the arc length of the lemniscate 
represents the first use of arc length in polar coordinates. It is ironic that calculus 
students study the lemniscate in one part of the course and arc length in polar 
coordinates in another, but they never put the two together, since the resulting integral 
cannot be evaluated by the usual methods of calculus. 

Our discussion shows that the lemniscate and its arc length were well known by the 
beginning of the eighteenth century. Thus Abel’s theorem on dividing the lemniscate 
into arcs of equal length deals with a topic familiar to the mathematical community 
of the time. 

We will say more about elliptic integrals and elliptic functions in the Historical 
Notes to the next section. 


Exercises for Section 15.1 


Exercise 1. Prove that the numbers described in Abel’s theorem at the beginning of the chapter
are precisely those in Theorem 10.2.1, provided we replace “product of several numbers” with
“product of distinct numbers” in Abel’s statement of the theorem.


Exercise 2. Show that in polar coordinates, the equation of the lemniscate is $r^2 = \cos(2\theta)$.


Exercise 3. Prove that the two improper integrals $\int_0^1 (1-t^4)^{-1/2}\,dt$ and $\int_{-1}^0 (1-t^4)^{-1/2}\,dt$
converge.


Exercise 4. Prove the arc length formula stated in (15.6).

Exercise 5. Show that (15.7) reduces to $(x^2+y^2)^2 = x^2 - y^2$ when $a = b = 1/\sqrt{2}$.


Exercise 6. Let n > 0 be an odd integer, and assume that the n-division points of the lemniscate 
can be constructed with straightedge and compass. Prove that the same is true for the 2n- 
division points. Your proof should include a picture. 


Exercise 7. Recall that in Greek geometry, the ellipse is defined to be the locus of all points 
whose sum of distances to two given points is constant. Suppose instead we consider the locus 
of all points whose product of distances to two given points is constant. Show that this leads 
to (15.7) when the given points are $(a,0)$, $(-a,0)$ and the constant is $b^2$.


15.2 THE LEMNISCATIC FUNCTION 
In (15.4) we defined Abel’s function $\varphi(s)$ by

(15.8)  $$r = \varphi(s) \iff s = \int_0^r \frac{1}{\sqrt{1-t^4}}\,dt.$$

Since $s$ represents arc length from the origin along the first quadrant portion of the
lemniscate, we see that $\varphi(s)$ is defined on $[0, \frac{\varpi}{2}]$, where

$$\varpi = 2\int_0^1 \frac{1}{\sqrt{1-t^4}}\,dt.$$

In this section, we will extend $\varphi(s)$ to a function on $\mathbb{R}$ of period $2\varpi$ and show that it
satisfies some remarkable addition and multiplication formulas. We will also apply
these formulas to straightedge-and-compass constructions on the lemniscate.
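To get a feel for the size of the constant $\varpi$, here is a minimal numerical sketch. It is not part of the text's argument: the function name `varpi`, the substitution $t = \sin\theta$ (which removes the singularity of the integrand at $t = 1$), and the choice of Simpson's rule are all assumptions made for illustration.

```python
import math

def varpi(n=2000):
    """Approximate varpi = 2 * integral_0^1 dt / sqrt(1 - t^4).

    With t = sin(theta) we have 1 - t^4 = cos(theta)^2 * (1 + sin(theta)^2),
    so the integral becomes integral_0^{pi/2} d(theta) / sqrt(1 + sin(theta)^2),
    which is smooth. Composite Simpson's rule is used (n must be even).
    """
    b = math.pi / 2
    h = b / n
    f = lambda th: 1.0 / math.sqrt(1.0 + math.sin(th) ** 2)
    total = f(0.0) + f(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * f(k * h)
    return 2.0 * total * h / 3.0

print(varpi())  # roughly 2.62206
```

The value is about $2.622$, so the lemniscate, with total arc length $2\varpi$, is a little shorter than the unit circle.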


A. A Periodic Function. Our first task is to define $\varphi(s)$ as a function of period
$2\varpi$ on $\mathbb{R}$. We will do this by extending the arc length interpretation of $\varphi(s)$ given in
Section 15.1.

The arc length parametrization of the lemniscate is defined by sending a real 
number s to the point P on the lemniscate such that: 


• If $s = 0$, then $P$ is the origin.

• If $s > 0$, then move from the origin into the first quadrant portion of the lemniscate
and continue along the curve until we reach the point $P$ whose cumulative arc
length from the origin is $s$.

• If $s < 0$, then move from the origin into the third quadrant portion and continue
until we reach the point $P$ whose cumulative arc length from the origin is $-s$.

We call $s$ the signed arc length variable of the lemniscate. When $|s|$ is large, we
may need to loop around several times before reaching the point $P$. Since the total
arc length of the lemniscate is $2\varpi$, we see that $s$ and $s + 2\varpi$ give the same point on
the lemniscate for any $s \in \mathbb{R}$. This is similar to measuring angles on the unit circle,
where $s$ and $s + 2\pi$ give the same point on the circle.

The lemniscate is $r^2 = \cos(2\theta)$ in polar coordinates. Recall that $r$ is allowed to
be negative as well as positive or zero. We will restrict $\theta$ to lie in $[-\frac{\pi}{4}, \frac{\pi}{4}]$, so that
$0 \le r \le 1$ gives the right half of the lemniscate and $-1 \le r \le 0$ gives the left half.
We call $r$ the polar distance of the corresponding point on the lemniscate. Strictly
speaking, $r$ is really the signed polar distance, since $r$ is negative on the left half of
the lemniscate. We will use the shorter term “polar distance” for simplicity.

Now consider (15.8). It is easy to see that the signed arc length $s$ satisfies

$$s = \int_0^r \frac{1}{\sqrt{1-t^4}}\,dt$$

when $-\frac{\varpi}{2} \le s \le \frac{\varpi}{2}$ and $-1 \le r \le 1$. This implies that (15.8) can be used to define $\varphi(s)$
for $-\frac{\varpi}{2} \le s \le \frac{\varpi}{2}$. In other words, for $s$ in this range, Abel’s function $\varphi(s)$ is simply
the polar distance (with the above convention on $r$) of the point on the lemniscate
with signed arc length $s$.

It is now easy to extend $\varphi$ to all of $\mathbb{R}$: given $s \in \mathbb{R}$, $\varphi(s)$ is the polar distance of
the point on the lemniscate whose signed arc length from the origin is $s$. Thus

$$\varphi(s) = r,$$

where $s$ and $r$ are related according to the diagram:

Note that $\varphi(s)$ has period $2\varpi$, since $s$ and $s + 2\varpi$ give the same point on the
lemniscate. Furthermore, $\varphi(s)$ also satisfies the following identities:

(15.9)  $$\varphi(-s) = -\varphi(s), \qquad \varphi(\varpi - s) = \varphi(s).$$

The first follows because $s$ and $-s$ correspond to points on the lemniscate symmetric
about the origin, and the second follows because $s$ and $\varpi - s$ correspond to points
symmetric about the $x$-axis (recall that each half of the lemniscate has length $\varpi$). See
Exercise 1 for the details. Using the arc length interpretation of $\varphi(s)$, one can show
that $\varphi(s)$ is infinitely differentiable for all $s \in \mathbb{R}$, though we omit the proof.

The function $\sin(\frac{\pi}{\varpi}s)$ has the same period and amplitude as $\varphi(s)$ (check this). If
we plot $\varphi(s)$ and $\sin(\frac{\pi}{\varpi}s)$ for $0 \le s \le 2\varpi$, then we get the following graphs:


[Figure: two graphs, titled “Abel’s Function” and “The Sine Function”.]


The function $\sin(x)$ satisfies identities similar to (15.9), but the full theory of $\sin(x)$
requires $\sin'(x) = \cos(x)$ as well. The same is true for $\varphi(s)$, where we will use $\varphi'(s)$.
We will also need the following crucial identity.

Proposition 15.2.1 Let $\varphi(s)$ be defined as above. Then

$$\varphi'^2(s) = 1 - \varphi^4(s).$$


Proof: Using (15.9) and the periodicity of $\varphi(s)$, you will show in Exercise 2 that it
suffices to prove that

$$\varphi'(s) = \sqrt{1 - \varphi^4(s)}, \qquad 0 \le s \le \tfrac{\varpi}{2}.$$

To derive this equation, first observe that (15.8) gives the identity

$$s = \int_0^{\varphi(s)} \frac{1}{\sqrt{1-t^4}}\,dt, \qquad 0 \le s \le \tfrac{\varpi}{2}.$$

Differentiating each side with respect to $s$ and using the Chain Rule and the Fundamental
Theorem of Calculus, we obtain

$$1 = \frac{1}{\sqrt{1-\varphi^4(s)}}\,\varphi'(s), \qquad 0 \le s < \tfrac{\varpi}{2}$$

(be sure you understand why $s = \frac{\varpi}{2}$ is excluded). It follows that

$$\varphi'(s) = \sqrt{1-\varphi^4(s)}, \qquad 0 \le s < \tfrac{\varpi}{2}.$$

For $s = \frac{\varpi}{2}$, note that $\sqrt{1-\varphi^4(s)}$ vanishes at $\frac{\varpi}{2}$ since $\varphi(\frac{\varpi}{2}) = 1$. Since $1$ is the
maximum value of $\varphi(s)$ (can you explain why?), we see that $\varphi'(s)$ also vanishes
at $\frac{\varpi}{2}$. This completes the proof. ∎


Other properties of $\varphi'(s)$ will be developed in Exercise 3, and in Exercise 4 you
will adapt the method used in Proposition 15.2.1 to derive the standard trigonometric
identity $\cos^2(x) = 1 - \sin^2(x)$.
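Proposition 15.2.1 can be checked numerically. The sketch below is an illustration, not the method of the text: it assumes the differential equation $\varphi''(s) = -2\varphi^3(s)$ from part (d) of Exercise 3 together with $\varphi(0) = 0$, $\varphi'(0) = 1$, and integrates it with the classical fourth-order Runge-Kutta scheme. The names `phi_pair` and `VARPI`, and the hard-coded numerical value of $\varpi$, are assumptions introduced here.

```python
import math

VARPI = 2.622057554292119  # numerical value of varpi, assumed precomputed

def phi_pair(s, steps=4000):
    """Return approximations to (phi(s), phi'(s)) for real s.

    Integrates phi'' = -2*phi**3 with phi(0) = 0, phi'(0) = 1 by RK4.
    A negative s simply gives a negative step size.
    """
    h = s / steps
    p, q = 0.0, 1.0                      # p ~ phi, q ~ phi'
    f = lambda v: -2.0 * v ** 3          # right-hand side for q' = phi''
    for _ in range(steps):
        k1p, k1q = q, f(p)
        k2p, k2q = q + 0.5 * h * k1q, f(p + 0.5 * h * k1p)
        k3p, k3q = q + 0.5 * h * k2q, f(p + 0.5 * h * k2p)
        k4p, k4q = q + h * k3q, f(p + h * k3p)
        p += h * (k1p + 2 * k2p + 2 * k3p + k4p) / 6.0
        q += h * (k1q + 2 * k2q + 2 * k3q + k4q) / 6.0
    return p, q

for s in (0.3, 1.0, VARPI / 2, 2.0, -1.7):
    p, q = phi_pair(s)
    # The last column should stay very close to 1 by Proposition 15.2.1.
    print(f"s={s:+.4f}  phi={p:+.6f}  phi'={q:+.6f}  check={q * q + p ** 4:.9f}")
```

At $s = \frac{\varpi}{2}$ the computed pair should be close to $(1, 0)$, matching the end of the proof above.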


B. Addition Laws. The addition law for sin(x) states that 
sin(x + y) = sin(x)cos(y) + cos(x)sin(y). 


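For $\varphi(x)$, the addition law goes back to Euler, who in 1753 proved the identity

(15.10)  $$\int_0^\alpha \frac{1}{\sqrt{1-t^4}}\,dt + \int_0^\beta \frac{1}{\sqrt{1-t^4}}\,dt = \int_0^\gamma \frac{1}{\sqrt{1-t^4}}\,dt$$

when $\alpha, \beta \in [0,1]$ and

$$\gamma = \frac{\alpha\sqrt{1-\beta^4} + \beta\sqrt{1-\alpha^4}}{1+\alpha^2\beta^2}.$$

To state this in terms of $\varphi$, let $x$, $y$, and $z$ represent the three integrals in (15.10), so
that $\varphi(x) = \alpha$, $\varphi(y) = \beta$, and $\varphi(z) = \gamma$. Then $x + y = z$ implies that

$$\varphi(x+y) = \varphi(z) = \gamma = \frac{\alpha\sqrt{1-\beta^4} + \beta\sqrt{1-\alpha^4}}{1+\alpha^2\beta^2},$$

which when combined with $\varphi(x) = \alpha$ and $\varphi(y) = \beta$ gives

(15.11)  $$\varphi(x+y) = \frac{\varphi(x)\sqrt{1-\varphi^4(y)} + \varphi(y)\sqrt{1-\varphi^4(x)}}{1+\varphi^2(x)\varphi^2(y)}.$$

Furthermore, $\sqrt{1-\varphi^4(x)} = \varphi'(x)$ for $0 \le x \le \frac{\varpi}{2}$ by Proposition 15.2.1. Thus

(15.12)  $$\varphi(x+y) = \frac{\varphi(x)\varphi'(y) + \varphi(y)\varphi'(x)}{1+\varphi^2(x)\varphi^2(y)}.$$

Rather than use Euler’s result, Abel gave a different proof that (15.12) holds for all
$x, y \in \mathbb{R}$. You will explore Abel’s argument in Exercise 5.

In Exercise 6 you will use $\varphi(-x) = -\varphi(x)$ to show that (15.12) implies the
subtraction law

$$\varphi(x-y) = \frac{\varphi(x)\varphi'(y) - \varphi(y)\varphi'(x)}{1+\varphi^2(x)\varphi^2(y)}.$$

When this is combined with (15.12), we easily obtain the identity

(15.13)  $$\varphi(x+y) + \varphi(x-y) = \frac{2\varphi(x)\varphi'(y)}{1+\varphi^2(x)\varphi^2(y)}.$$

This will be useful later in the section.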


The addition laws give some nice straightedge-and-compass constructions. 


Example 15.2.2 Let us divide the lemniscate into eight pieces of length $\frac{2\varpi}{8} = \frac{\varpi}{4}$.
Here is a picture of $r_0 = \varphi(\frac{\varpi}{4})$ and the 8-division points:

This picture and Proposition 15.1.1 show that to construct the 8-division points, we
need only construct $r_0$. Since $\varphi(\frac{\varpi}{2}) = 1$, the addition law (15.11) implies that

$$1 = \varphi(\tfrac{\varpi}{2}) = \varphi(\tfrac{\varpi}{4} + \tfrac{\varpi}{4}) = \frac{2\varphi(\frac{\varpi}{4})\sqrt{1-\varphi^4(\frac{\varpi}{4})}}{1+\varphi^4(\frac{\varpi}{4})} = \frac{2r_0\sqrt{1-r_0^4}}{1+r_0^4}.$$

Solving this equation in Maple or Mathematica shows that the unique real positive
solution is given by

$$r_0 = \sqrt{\sqrt{2}-1} \approx .643594.$$

This is obviously constructible and hence gives the desired construction. ◁▷
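The equation for $r_0$ can also be solved by hand. The following short derivation is a sketch added for illustration (it is not how the text proceeds); it squares the equation, which is harmless here because both sides are positive for $0 < r_0 < 1$.

```latex
\begin{align*}
1 = \frac{2r_0\sqrt{1-r_0^4}}{1+r_0^4}
  &\iff (1+r_0^4)^2 - 4r_0^2(1-r_0^4) = 0 \\
  &\iff r_0^8 + 4r_0^6 + 2r_0^4 - 4r_0^2 + 1 = 0 \\
  &\iff (r_0^4 + 2r_0^2 - 1)^2 = 0 \\
  &\iff r_0^2 = -1 \pm \sqrt{2}.
\end{align*}
```

Since $r_0^2 > 0$, only $r_0^2 = \sqrt{2} - 1$ survives, which agrees with the value found by computer algebra.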


The reasoning behind Example 15.2.2 can be generalized as follows.

Proposition 15.2.3 If $\varphi(x_0)$ is constructible, then so is $\varphi(\frac{x_0}{2})$.

Proof: Setting $x = y$ in (15.12) gives the duplication formula

(15.14)  $$\varphi(2x) = \frac{2\varphi(x)\varphi'(x)}{1+\varphi^4(x)}.$$

Let $r_0 = \varphi(\frac{x_0}{2})$ and $a = \varphi(x_0)$. Then (15.14) and $\varphi'^2(\frac{x_0}{2}) = 1 - \varphi^4(\frac{x_0}{2})$ imply that

$$a^2 = \left(\frac{2r_0\varphi'(\frac{x_0}{2})}{1+r_0^4}\right)^2 = \frac{4r_0^2(1-r_0^4)}{(1+r_0^4)^2}.$$

To solve this equation for $r_0$, let $t \in \mathbb{C}$ satisfy

(15.15)  $$t^2 = \frac{2ir_0^2}{1-r_0^4}, \qquad i = \sqrt{-1},$$

and observe that

(15.16)  $$\frac{-2it^2}{1-t^4} = \frac{-2i\,\frac{2ir_0^2}{1-r_0^4}}{1-\left(\frac{2ir_0^2}{1-r_0^4}\right)^2} = \frac{4r_0^2(1-r_0^4)}{(1+r_0^4)^2} = a^2.$$

Solving (15.16) for $t^2$ by the quadratic formula shows that $t^2$ is constructible because
$a$ is, and then solving (15.15) for $r_0$ completes the proof. ∎

The formulas (15.15) and (15.16) in the above proof seem to come out of nowhere.
In Section 15.4 we will use complex multiplication and $2 = (1+i)(1-i)$ to “factor”
the duplication formula for $\varphi(2x)$ into (15.15) and (15.16). So these formulas will
eventually make perfect sense.
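The two quadratic steps in the proof of Proposition 15.2.3 can be carried out explicitly. The sketch below is an illustration under stated assumptions: it takes $0 < x_0 \le \frac{\varpi}{2}$, so that $0 < a \le 1$, writes $t^2 = i\tau$ with $\tau$ real, and picks the roots of the two quadratics that give the small positive value $r_0 = \varphi(\frac{x_0}{2})$. The names `halve` and `check` and this choice of roots are not from the text.

```python
import math

def halve(a):
    """Given a = phi(x0) with 0 < a <= 1, return r0 = phi(x0 / 2).

    Step 1: (15.16) says a^2 T^2 - 2i T - a^2 = 0 for T = t^2, so T = i*tau
            with tau = (1 - sqrt(1 - a^4)) / a^2 (the smaller root).
    Step 2: (15.15) says T R^2 + 2i R - T = 0 for R = r0^2, whose positive
            root is R = (sqrt(1 + tau^2) - 1) / tau.
    Only +, -, *, / and square roots are used, as constructibility requires.
    """
    tau = (1.0 - math.sqrt(1.0 - a ** 4)) / a ** 2
    R = (math.sqrt(1.0 + tau ** 2) - 1.0) / tau
    return math.sqrt(R)

def check(a):
    """Return the residuals of (15.15) and (15.16) for the computed r0."""
    r0 = halve(a)
    t2 = 1j * (1.0 - math.sqrt(1.0 - a ** 4)) / a ** 2
    res15 = abs(t2 - 2j * r0 ** 2 / (1.0 - r0 ** 4))
    res16 = abs(-2j * t2 / (1.0 - t2 ** 2) - a ** 2)
    return res15, res16

print(halve(1.0))   # a = phi(varpi/2) = 1 should reproduce Example 15.2.2
print(check(0.5))   # both residuals should be near zero
```

Applying `halve` repeatedly, starting from $a = 1$, gives numerical values of $\varphi(\frac{\varpi}{4})$, $\varphi(\frac{\varpi}{8})$, and so on.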

Here is a slightly more complicated example. 


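Example 15.2.4 Dividing the lemniscate into six pieces of equal length gives the
following picture:

To compute $r_0 = \varphi(\frac{2\varpi}{6}) = \varphi(\frac{\varpi}{3})$, first observe that (15.13) with $2x$ and $x$ in place
of $x$ and $y$ gives

$$\varphi(3x) + \varphi(x) = \varphi(2x+x) + \varphi(2x-x) = \frac{2\varphi(2x)\varphi'(x)}{1+\varphi^2(2x)\varphi^2(x)} = \frac{2\,\frac{2\varphi(x)\varphi'(x)}{1+\varphi^4(x)}\,\varphi'(x)}{1+\left(\frac{2\varphi(x)\varphi'(x)}{1+\varphi^4(x)}\right)^2\varphi^2(x)},$$

where the last line uses (15.14). Using $\varphi'^2(x) = 1 - \varphi^4(x)$ and a bit of algebra, we
obtain the tripling formula

(15.17)  $$\varphi(3x) = -\varphi(x)\,\frac{\varphi^8(x) + 6\varphi^4(x) - 3}{1 + 6\varphi^4(x) - 3\varphi^8(x)}.$$

Since $\varphi(\varpi) = 0$, substituting $x = \frac{\varpi}{3}$ into (15.17) shows that $r_0 = \varphi(\frac{\varpi}{3})$ satisfies

$$r_0^8 + 6r_0^4 - 3 = 0,$$

which is easily seen to have the unique real positive solution

$$r_0 = \sqrt[4]{2\sqrt{3}-3} \approx .825379.$$

This is clearly constructible and hence gives the desired straightedge-and-compass
construction. ◁▷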
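As a sanity check on Example 15.2.4, one can confirm numerically that the point with polar distance $r_0$ really sits at arc length $\frac{\varpi}{3}$ from the origin. The sketch below is illustrative only: the helper name `arc_length`, the substitution $t = \sin\theta$, and the use of Simpson's rule are assumptions made here, not part of the text.

```python
import math

def arc_length(r, n=2000):
    """Arc length s = integral_0^r dt / sqrt(1 - t^4) for 0 <= r <= 1.

    The substitution t = sin(theta) turns the integrand into
    1 / sqrt(1 + sin(theta)^2) on [0, asin(r)], which composite
    Simpson's rule (n even) handles without endpoint trouble.
    """
    b = math.asin(r)
    h = b / n
    f = lambda th: 1.0 / math.sqrt(1.0 + math.sin(th) ** 2)
    total = f(0.0) + f(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * f(k * h)
    return total * h / 3.0

u = -3.0 + 2.0 * math.sqrt(3.0)   # positive root of u^2 + 6u - 3 = 0, u = r0^4
r0 = u ** 0.25
varpi = 2.0 * arc_length(1.0)

print(r0)                             # should be close to .825379
print(arc_length(r0), varpi / 3.0)    # the two values should agree
```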


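C. Multiplication by Integers. The doubling and tripling formulas

$$\varphi(2x) = \frac{2\varphi(x)\varphi'(x)}{1+\varphi^4(x)},$$

$$\varphi(3x) = -\varphi(x)\,\frac{\varphi^8(x) + 6\varphi^4(x) - 3}{1 + 6\varphi^4(x) - 3\varphi^8(x)}$$

from (15.14) and (15.17) can be generalized to formulas that express $\varphi(nx)$ in terms
of $\varphi(x)$ and $\varphi'(x)$ for any positive integer $n$.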


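Theorem 15.2.5 Given an integer $n > 0$, there are relatively prime polynomials
$P_n(u), Q_n(u) \in \mathbb{Z}[u]$ such that if $n$ is odd, then

$$\varphi(nx) = \varphi(x)\,\frac{P_n(\varphi^4(x))}{Q_n(\varphi^4(x))},$$

and if $n$ is even, then

$$\varphi(nx) = \varphi(x)\,\frac{P_n(\varphi^4(x))}{Q_n(\varphi^4(x))}\,\varphi'(x).$$

Furthermore, $Q_n(0) = 1$.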


Proof: We will prove the theorem by induction on $n$. Setting $P_1(u) = Q_1(u) = 1$
gives the desired formula for $n = 1$. For $n = 2$, note that (15.14) can be written

$$\varphi(2x) = \varphi(x)\,\frac{2}{1+\varphi^4(x)}\,\varphi'(x).$$

Thus the theorem holds for $n = 2$ with $P_2(u) = 2$, $Q_2(u) = 1 + u$. Now assume that it
holds for $n - 1$ and $n$. Using (15.13) with $nx$ and $x$ in place of $x$ and $y$, we obtain

$$\varphi((n+1)x) = -\varphi((n-1)x) + \frac{2\varphi(nx)\varphi'(x)}{1+\varphi^2(nx)\varphi^2(x)}.$$

When $n$ is even, the inductive hypothesis turns this into

$$\varphi((n+1)x) = -\varphi(x)\,\frac{P_{n-1}(\varphi^4(x))}{Q_{n-1}(\varphi^4(x))} + \frac{2\varphi(x)\,\frac{P_n(\varphi^4(x))}{Q_n(\varphi^4(x))}\,\varphi'^2(x)}{1+\varphi^4(x)\,\frac{P_n^2(\varphi^4(x))}{Q_n^2(\varphi^4(x))}\,\varphi'^2(x)}.$$

Using $\varphi'^2(x) = 1 - \varphi^4(x)$ and clearing denominators, this simplifies to

$$\varphi((n+1)x) = \varphi(x)\,\frac{P_{n+1}(\varphi^4(x))}{Q_{n+1}(\varphi^4(x))},$$

where

(15.18)  $$Q_{n+1}(u) = Q_{n-1}(u)\bigl(Q_n^2(u) + uP_n^2(u)(1-u)\bigr)$$

and $P_{n+1}(u)$ is given by a similar recursive formula (see Exercise 7). It follows that
$P_{n+1}(u), Q_{n+1}(u) \in \mathbb{Z}[u]$ by our inductive hypothesis. Note also that $Q_{n+1}(0) = 1$
follows from $Q_n(0) = Q_{n-1}(0) = 1$. Finally, dividing $P_{n+1}(u)$ and $Q_{n+1}(u)$ by their
greatest common divisor shows that we may assume that they are relatively prime in
$\mathbb{Z}[u]$. In Exercise 7 you will show that $Q_{n+1}(0) = 1$ continues to hold, after multiplying
$P_{n+1}(u), Q_{n+1}(u)$ by $-1$ if necessary.

The case when $n$ is odd is similar and will be covered in Exercise 7. ∎
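The recursion (15.18) is easy to test in the first case $n = 2$, where it should reproduce the denominator $1 + 6u - 3u^2$ of the tripling formula (15.17). The sketch below is an illustration: polynomials are stored as integer coefficient lists with the constant term first, and the helper names `pmul`, `padd`, and `next_Q` are assumptions introduced here.

```python
def pmul(p, q):
    """Multiply two integer polynomials (coefficient lists, constant term first)."""
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

def padd(p, q):
    """Add two integer polynomials of possibly different lengths."""
    n = max(len(p), len(q))
    return [(p[i] if i < len(p) else 0) + (q[i] if i < len(q) else 0)
            for i in range(n)]

def next_Q(Q_prev, P_n, Q_n):
    """(15.18) for even n: Q_{n+1} = Q_{n-1} * (Q_n^2 + u * P_n^2 * (1 - u))."""
    u_times_one_minus_u = [0, 1, -1]          # u - u^2
    inner = padd(pmul(Q_n, Q_n), pmul(pmul(P_n, P_n), u_times_one_minus_u))
    return pmul(Q_prev, inner)

# n = 2: Q_1 = 1, P_2 = 2, Q_2 = 1 + u
Q3 = next_Q([1], [2], [1, 1])
print(Q3)  # [1, 6, -3], i.e. 1 + 6u - 3u^2
```

Note that `Q3[0]` equals 1, in line with the requirement $Q_n(0) = 1$ of Theorem 15.2.5.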


Theorem 15.2.5 has some nice consequences concerning the division points on
the lemniscate. The polar distances of the $n$-division points are

$$\varphi\bigl(m\tfrac{2\varpi}{n}\bigr), \qquad m = 0, 1, \dots, n-1.$$

When $n$ is odd, the periodicity of $\varphi$ and Theorem 15.2.5 imply that

$$0 = \varphi(m \cdot 2\varpi) = \varphi\bigl(n \cdot m\tfrac{2\varpi}{n}\bigr) = \varphi\bigl(m\tfrac{2\varpi}{n}\bigr)\,\frac{P_n\bigl(\varphi^4(m\frac{2\varpi}{n})\bigr)}{Q_n\bigl(\varphi^4(m\frac{2\varpi}{n})\bigr)},$$

so that the polar distance $\varphi(m\frac{2\varpi}{n})$ is a root of $uP_n(u^4)$ when $n$ is odd. In Exercise 8
you will show that when $n$ is even, the polar distances are roots of $uP_n(u^4)(1-u^2)$.
We call these polynomials the $n$-division polynomials. We have thus proved the
following corollary of Theorem 15.2.5.

Corollary 15.2.6 Let $n \in \mathbb{Z}$ be positive. Then the polar distances of the $n$-division
points of the lemniscate are roots of the $n$-division polynomials defined above. ∎

We also have the following result about straightedge-and-compass constructions.

Corollary 15.2.7 Let $n$ be a positive integer such that $\varphi(\frac{2\varpi}{n})$ is constructible.
(a) $\varphi(m\frac{2\varpi}{n})$ is constructible for every $m \in \mathbb{Z}$.

(b) The $n$-division points of the lemniscate are constructible by straightedge and
compass.

(c) If in addition $\varphi(\frac{2\varpi}{m})$ is constructible for a positive integer $m$, then so is $\varphi(\frac{2\varpi}{N})$,
where $N = \mathrm{lcm}(n,m)$.

Proof: If $\varphi(\frac{2\varpi}{n})$ is constructible, then so is $\varphi'(\frac{2\varpi}{n})$ since $\varphi'(x) = \pm\sqrt{1-\varphi^4(x)}$.
Part (a) is obvious for $n = 1$ and $2$, so we may assume that $n > 2$.

When $m > 0$, Theorem 15.2.5 implies that $\varphi(m\frac{2\varpi}{n})$ is a rational function of $\varphi(\frac{2\varpi}{n})$
and $\varphi'(\frac{2\varpi}{n})$ with coefficients in $\mathbb{Z}$. In Exercise 9 you will show that the denominator
is nonvanishing, since $n > 2$ and the polynomials $P_m(u), Q_m(u)$ in Theorem 15.2.5 are
relatively prime. Hence $\varphi(m\frac{2\varpi}{n})$ is constructible, since the constructible numbers
form a subfield of $\mathbb{C}$.

The case $m = 0$ is obvious, and $m < 0$ follows from $m > 0$ because $\varphi$ is an odd
function. This completes the proof of part (a).

Part (b) follows immediately from part (a) and Proposition 15.1.1.

For part (c), let $d = \gcd(n,m)$. Then $N = \mathrm{lcm}(n,m) = \frac{nm}{d}$. It follows that if
integers $\mu, \nu$ satisfy $\mu m + \nu n = d$, then

$$\frac{2\varpi}{N} = \mu\,\frac{2\varpi}{n} + \nu\,\frac{2\varpi}{m}.$$

By part (a), $\varphi(\mu\frac{2\varpi}{n})$ and $\varphi(\nu\frac{2\varpi}{m})$ are constructible, and, as above, the same is
true for $\varphi'(\mu\frac{2\varpi}{n})$ and $\varphi'(\nu\frac{2\varpi}{m})$. Then the addition law (15.12) expresses $\varphi(\frac{2\varpi}{N})$ as
a rational expression with coefficients in $\mathbb{Z}$ in the constructible numbers given by the
values of $\varphi$ and $\varphi'$ at $\mu\frac{2\varpi}{n}$ and $\nu\frac{2\varpi}{m}$. Since the denominator of this rational expression
is the nonzero number

$$1 + \varphi^2\bigl(\mu\tfrac{2\varpi}{n}\bigr)\varphi^2\bigl(\nu\tfrac{2\varpi}{m}\bigr) \ne 0,$$

it follows that $\varphi(\frac{2\varpi}{N})$ is constructible. ∎

Parts (b) and (c) of Corollary 15.2.7 imply that if the $n$-division points and $m$-division
points of the lemniscate are constructible by straightedge and compass, then
the same is true of the $N$-division points for $N = \mathrm{lcm}(n,m)$. This fact will be useful
in Section 15.5.
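The arithmetic behind part (c) is just Bezout's identity. As a small illustration (the function names `ext_gcd` and `split_step` are hypothetical, chosen here), the sketch below finds integers $\mu, \nu$ with $\mu m + \nu n = d$ by the extended Euclidean algorithm and confirms with exact fractions that $\frac{1}{N} = \frac{\mu}{n} + \frac{\nu}{m}$, which is the displayed relation divided by $2\varpi$.

```python
from fractions import Fraction

def ext_gcd(a, b):
    """Return (g, x, y) with a*x + b*y == g == gcd(a, b), for a, b >= 0."""
    if b == 0:
        return a, 1, 0
    g, x, y = ext_gcd(b, a % b)
    return g, y, x - (a // b) * y

def split_step(n, m):
    """Return (N, mu, nu) with N = lcm(n, m) and 1/N == mu/n + nu/m."""
    d, mu, nu = ext_gcd(m, n)          # mu*m + nu*n == d
    N = n * m // d
    # Exact check of the identity used in the proof of part (c).
    assert Fraction(1, N) == Fraction(mu, n) + Fraction(nu, m)
    return N, mu, nu

print(split_step(3, 4))
print(split_step(6, 4))
```

For $n = 3$ and $m = 4$, for instance, the 12-division step $\frac{2\varpi}{12}$ is obtained as a difference of a 3-division step and a 4-division step.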

Here are some applications of Corollary 15.2.7. 


Example 15.2.8 Since $\varphi(2\varpi) = 0$, Proposition 15.2.3 implies that $\varphi(\frac{2\varpi}{2^n})$ is
constructible for $n \ge 0$. Then part (b) of Corollary 15.2.7 shows that the $2^n$-division
points of the lemniscate can be constructed by straightedge and compass. ◁▷

Example 15.2.9 When $n = 5$, one can show that

(15.19)  $$\varphi(5x) = \varphi(x)\,\frac{P_5(\varphi^4(x))}{Q_5(\varphi^4(x))},$$

where

$$P_5(u) = u^6 + 50u^5 - 125u^4 + 300u^3 - 105u^2 - 62u + 5,$$
$$Q_5(u) = 1 + 50u - 125u^2 + 300u^3 - 105u^4 - 62u^5 + 5u^6$$

(see [14, p. 82]). Note the “reverse symmetry” of the coefficients of $P_5(u)$ and $Q_5(u)$.
For the 5-division points of the lemniscate, the discussion preceding Corollary 15.2.6
implies that $r_0 = \varphi(\frac{2\varpi}{5})$ is a root of the 5-division polynomial $uP_5(u^4)$. Thus

$$0 = r_0P_5(r_0^4) = r_0\bigl(r_0^{24} + 50r_0^{20} - 125r_0^{16} + 300r_0^{12} - 105r_0^8 - 62r_0^4 + 5\bigr).$$

You will show in Exercise 10 that the real positive solutions are constructible, though
this is not obvious from the above equation. By Corollary 15.2.7, it follows that the
5-division points are constructible by straightedge and compass. ◁▷
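The factorization and the closed-form roots quoted in Exercise 10 can be checked mechanically. The following sketch is illustrative: it works in the variable $u = r_0^4$, stores polynomials as coefficient lists with the constant term first, and uses helper names (`pmul`, `peval`) that are assumptions made here.

```python
import math

def pmul(p, q):
    """Multiply two integer polynomials (coefficient lists, constant term first)."""
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

def peval(p, x):
    """Evaluate a polynomial at x by Horner's rule."""
    acc = 0.0
    for c in reversed(p):
        acc = acc * x + c
    return acc

P5 = [5, -62, -105, 300, -125, 50, 1]   # P_5(u) from (15.19)
quad = [5, -2, 1]                        # u^2 - 2u + 5 (no real roots)
quartic = [1, -12, -26, 52, 1]           # u^4 + 52u^3 - 26u^2 - 12u + 1

print(pmul(quad, quartic) == P5)         # exact integer check of the factorization

s5 = math.sqrt(5.0)
roots_u = [-13 + 6 * s5 + sign * 2 * math.sqrt(85 - 38 * s5) for sign in (1, -1)]
for u in roots_u:
    # u ** 0.25 is a positive polar distance; the residual should be near zero.
    print(u ** 0.25, peval(P5, u))
```

Each root involves only nested square roots of rational numbers, which is why the corresponding polar distances are constructible.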


This discussion makes it clear that understanding the $n$-division points of the
lemniscate is intimately related to the multiplication formula for $\varphi(nx)$. But to
unleash the full power of these formulas, we will need to extend $\varphi$ to a function of a
complex variable. This will be done in the next section.


Historical Notes 


Although the link between arc length and the lemniscate goes back to Bernoulli,
the first person to make substantial progress in this area was Fagnano. In 1718 he
proved the case $\alpha = \beta$ of Euler’s addition law (15.10), namely

$$2\int_0^\alpha \frac{1}{\sqrt{1-t^4}}\,dt = \int_0^\gamma \frac{1}{\sqrt{1-t^4}}\,dt \quad\text{when}\quad \gamma = \frac{2\alpha\sqrt{1-\alpha^4}}{1+\alpha^4}.$$

Using this and other results, Fagnano was able to divide one arch of the lemniscate
into two, three, and five segments of equal length by straightedge and compass.
Fagnano’s results and methods are discussed in [1].

Things got really interesting when Fagnano’s papers were submitted to the Berlin
Academy as part of his application for membership. Euler was asked to read these
papers in December 1751, and by 1753 he was able to show that Fagnano’s duplication
formula was a special case of (15.10). More importantly, he also realized that $\sqrt{1-t^4}$
could be replaced by $\sqrt{P(t)}$, where $P(t)$ is any separable polynomial of degree 4 with
real coefficients. This led to the theory of elliptic integrals, which was developed
at great length by Lagrange and Legendre. Eventually the integrals were put in the
standard form

(15.20)  $$\int \frac{1}{\sqrt{1-k^2\sin^2\theta}}\,d\theta,$$

which after the substitution $t = \sin\theta$ gives

(15.21)  $$\int \frac{1}{\sqrt{(1-t^2)(1-k^2t^2)}}\,dt.$$

We call $k$ the modulus, so that $\int (1-t^4)^{-1/2}\,dt$ corresponds to the modulus $k = i$.

The first person to consider the inverse function of $\int (1-t^4)^{-1/2}\,dt$ was Gauss
in 1797, though this work was not published until after his death in 1855. Abel
and Jacobi introduced the inverse functions of elliptic integrals in 1827. In Abel’s
great paper Recherches sur les fonctions elliptiques [Abel, Vol. I, pp. 263-388], he
considered the inverse function of an elliptic integral of the form

$$\int \frac{1}{\sqrt{(1-c^2t^2)(1+e^2t^2)}}\,dt,$$

so that $c = e = 1$ gives the lemniscatic function we’ve been studying. Jacobi, on the
other hand, used the integral (15.20) and wrote its inverse function as $\theta = \operatorname{am} u$. Thus
$\sin\theta = \sin\operatorname{am} u$ is the inverse function of (15.21). These days, we write $\sin\operatorname{am} u$ as
$\operatorname{sn}(u,k)$, or simply $\operatorname{sn}(u)$ if the modulus is understood, though Mathematica writes
$\operatorname{sn}(u,k)$ as JacobiSN[u, k^2]. In the text, we used JacobiSN[u, -1] to draw the graph
of the lemniscatic function $\varphi(u) = \operatorname{sn}(u, i)$.

One of the critical discoveries of Gauss, Abel, and Jacobi is that inverse functions
of elliptic integrals are doubly periodic functions of a complex variable. We will
consider a special case of this in the next section. More on the history of elliptic
integrals can be found in [1], [7, pp. 3-16], and [12, pp. 267-268]. A nice introduction
to the duplication formula (15.14) appears in [16].


Exercises for Section 15.2 


Exercise 1. Give a careful proof of (15.9) using the hints given in the text.

Exercise 2. Supply the details needed to complete the proof of Proposition 15.2.1.

Exercise 3. Here are some useful properties of $\varphi'(s)$.
(a) $\varphi(s)$ has period $2\varpi$. Explain why this implies that the same is true for $\varphi'(s)$.
(b) $\varphi(s)$ is an odd function by (15.9). Explain why this implies that $\varphi'(s)$ is even.
(c) Use (15.9) to prove that $\varphi'(\varpi - s) = -\varphi'(s)$.
(d) Use Proposition 15.2.1 to prove that $\varphi''(s) = -2\varphi^3(s)$.

Exercise 4. Suppose that we define $\sin(x)$ by $y = \sin(x) \iff x = \int_0^y (1-t^2)^{-1/2}\,dt$. Then
define $\cos(x)$ to be $\sin'(x)$. Use the method of Proposition 15.2.1 to prove the standard
trigonometric identity $\cos^2(x) = 1 - \sin^2(x)$.


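Exercise 5. Here is Abel’s proof of the addition law for $\varphi$.
(a) Let $g(x,y)$ be differentiable on $\mathbb{R}^2$, and set $h(u,v) = g(\frac12(u+v), \frac12(u-v))$. Use the Chain
Rule to prove that

$$\frac{\partial h}{\partial v}(u,v) = \frac12\frac{\partial g}{\partial x}\bigl(\tfrac12(u+v), \tfrac12(u-v)\bigr) - \frac12\frac{\partial g}{\partial y}\bigl(\tfrac12(u+v), \tfrac12(u-v)\bigr).$$

(b) Use part (a) to show that $g(x,y) = g(x+y, 0)$ on $\mathbb{R}^2$ if and only if $\frac{\partial g}{\partial x} = \frac{\partial g}{\partial y}$ on $\mathbb{R}^2$.
(c) Prove the addition law for $\varphi$ by applying part (b) to

$$g(x,y) = \frac{\varphi(x)\varphi'(y) + \varphi(y)\varphi'(x)}{1+\varphi^2(x)\varphi^2(y)}.$$

Part (d) of Exercise 3 will be useful.

Exercise 6. Show that the subtraction law

$$\varphi(x-y) = \frac{\varphi(x)\varphi'(y) - \varphi(y)\varphi'(x)}{1+\varphi^2(x)\varphi^2(y)}$$

follows from the addition law together with (15.9) and Exercise 3.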


Exercise 7. The proof of Theorem 15.2.5 uses induction on $n$.
(a) Assume that $n$ is even. In (15.18), we gave a formula for $Q_{n+1}(u)$ in terms of $Q_n(u)$ and
$Q_{n-1}(u)$. Derive the corresponding formula for $P_{n+1}(u)$.
(b) Suppose that polynomials $P_n(u), Q_n(u)$ satisfy all of the conditions of the theorem
except for the requirement that they be relatively prime. Since $\mathbb{Z}[u]$ is a UFD, we can
write $P_n(u) = C_n(u)\tilde{P}_n(u)$, $Q_n(u) = C_n(u)\tilde{Q}_n(u)$, where $C_n(u), \tilde{P}_n(u), \tilde{Q}_n(u) \in \mathbb{Z}[u]$ and
$\tilde{P}_n(u), \tilde{Q}_n(u)$ are relatively prime. Prove that we can assume that $\tilde{Q}_n(0) = 1$ and that
$\tilde{P}_n(u), \tilde{Q}_n(u)$ satisfy all conditions of Theorem 15.2.5.
(c) Complete the inductive step of the proof when $n$ is odd.

Exercise 8. Let $n$ be even, and let $P_n(u)$ be the polynomial from Theorem 15.2.5. Complete
the proof of Corollary 15.2.6 by showing that the polar distances of the $n$-division points of
the lemniscate are roots of $uP_n(u^4)(1-u^2)$.


Exercise 9. This exercise is concerned with the proof of Corollary 15.2.7.
(a) Suppose that $P(u), Q(u) \in \mathbb{Z}[u]$ are relatively prime and $Q(0) = 1$. Prove that $uP(u^4)$ and
$Q(u^4)$ have no common roots in any extension of $\mathbb{Q}$.
(b) Fix $x$ in $\mathbb{R}$ and $m > 0$ in $\mathbb{Z}$, and let $P_m(u), Q_m(u) \in \mathbb{Z}[u]$ be as in Theorem 15.2.5. Thus
$\varphi(mx)\,Q_m(\varphi^4(x)) = \varphi(x)\,P_m(\varphi^4(x))$. Prove that $Q_m(\varphi^4(x)) \ne 0$ when $\varphi(x) \ne 0$.
(c) Show that $\varphi(\frac{2\varpi}{n}) \ne 0$ when $n > 2$ is in $\mathbb{Z}$ and conclude that $Q_m(\varphi^4(\frac{2\varpi}{n})) \ne 0$.


Exercise 10. The polar distances of the 5-division points of the lemniscate satisfy the equation

$$0 = r_0\bigl(r_0^{24} + 50r_0^{20} - 125r_0^{16} + 300r_0^{12} - 105r_0^8 - 62r_0^4 + 5\bigr).$$

This equation was first derived by Fagnano in 1718.
(a) Show that the $r_0$ corresponding to the 10-division points also satisfy this equation.
(b) Use Maple or Mathematica to show that this equation factors as

$$0 = r_0\bigl(r_0^8 - 2r_0^4 + 5\bigr)\bigl(r_0^{16} + 52r_0^{12} - 26r_0^8 - 12r_0^4 + 1\bigr)$$

and that the only positive real solutions are

$$\sqrt[4]{-13 + 6\sqrt{5} \pm 2\sqrt{85 - 38\sqrt{5}}}.$$

Explain (with a picture) how these solutions relate to the 5- and 10-division points.

Exercise 11. Use $\sin(x+y) = \sin x\cos y + \sin y\cos x$ to show that if $\alpha, \beta \in [0,1]$, then

$$\int_0^\alpha \frac{1}{\sqrt{1-t^2}}\,dt + \int_0^\beta \frac{1}{\sqrt{1-t^2}}\,dt = \int_0^\gamma \frac{1}{\sqrt{1-t^2}}\,dt,$$

where $\gamma$ is the real number defined by

$$\gamma = \alpha\sqrt{1-\beta^2} + \beta\sqrt{1-\alpha^2}.$$

Note the similarity to (15.10).

Exercise 12. Show that the substitution $t = \sin\theta$ transforms (15.20) into (15.21), and use this
to prove carefully that $\varphi(u) = \sin\operatorname{am}(u)$ when the modulus is $k = i$.


15.3 THE COMPLEX LEMNISCATIC FUNCTION 


Corollary 15.2.6 implies that the polar distances $r = \varphi(m\frac{2\varpi}{n})$ of the $n$-division points
of the lemniscate are roots of the $n$-division polynomials. To prove Abel’s theorem
on the lemniscate, we need to represent all roots of these polynomials using $\varphi$. Since
many of the roots are complex, this requires that we follow Gauss and Abel and
extend $\varphi$ to a function defined on $\mathbb{C}$.

Abel began by considering $\varphi(iy)$ for $y \in \mathbb{R}$. We know that $r = \varphi(y)$ is the inverse
function of $y = \int_0^r (1-t^4)^{-1/2}\,dt$. The change of variables $t = iu$ shows that

$$\int_0^{ir} \frac{1}{\sqrt{1-t^4}}\,dt = i\int_0^r \frac{1}{\sqrt{1-u^4}}\,du = iy.$$

This suggests that $\varphi(iy)$ can be defined to be $\varphi(iy) = ir = i\varphi(y)$. Then Abel used the
addition law to define $\varphi(x+iy)$ as

$$\varphi(x+iy) = \frac{\varphi(x)\varphi'(iy) + \varphi(iy)\varphi'(x)}{1+\varphi^2(x)\varphi^2(iy)}.$$

Since $\varphi(iy) = i\varphi(y)$ easily implies that $\varphi'(iy) = \varphi'(y)$ (see Exercise 1), the formula
for $\varphi(x+iy)$ simplifies to

(15.22)  $$\varphi(x+iy) = \frac{\varphi(x)\varphi'(y) + i\varphi(y)\varphi'(x)}{1-\varphi^2(x)\varphi^2(y)}.$$

To make Abel’s approach rigorous, we will define $\varphi(z)$ using (15.22). Over
$\mathbb{R}$, $\varphi(x)$ is periodic and defined everywhere; over $\mathbb{C}$, we will see that $\varphi(z)$ is doubly
periodic and has poles. The properties of $\varphi(z)$ will play a crucial role in Sections 15.4
and 15.5.
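Formula (15.22) is concrete enough to evaluate numerically. The sketch below is an illustration under stated assumptions: the real values $\varphi(x)$, $\varphi'(x)$ are obtained by integrating $\varphi'' = -2\varphi^3$ (part (d) of Exercise 3 in Section 15.2) with a Runge-Kutta scheme, and the names `phi_pair`, `phi_c`, and `VARPI` are introduced here rather than taken from the text. It then spot-checks $\varphi(iy) = i\varphi(y)$ and the double periodicity announced above.

```python
VARPI = 2.622057554292119  # numerical value of varpi, assumed precomputed

def phi_pair(s, steps=4000):
    """Approximate (phi(s), phi'(s)) for real s via RK4 on phi'' = -2*phi**3."""
    h = s / steps
    p, q = 0.0, 1.0                      # phi(0) = 0, phi'(0) = 1
    f = lambda v: -2.0 * v ** 3
    for _ in range(steps):
        k1p, k1q = q, f(p)
        k2p, k2q = q + 0.5 * h * k1q, f(p + 0.5 * h * k1p)
        k3p, k3q = q + 0.5 * h * k2q, f(p + 0.5 * h * k2p)
        k4p, k4q = q + h * k3q, f(p + h * k3p)
        p += h * (k1p + 2 * k2p + 2 * k3p + k4p) / 6.0
        q += h * (k1q + 2 * k2q + 2 * k3q + k4q) / 6.0
    return p, q

def phi_c(z):
    """Evaluate phi(x + iy) by formula (15.22); undefined at the poles."""
    px, dpx = phi_pair(z.real)
    py, dpy = phi_pair(z.imag)
    return (px * dpy + 1j * py * dpx) / (1.0 - px * px * py * py)

z = 0.3 + 0.2j
print(phi_c(0.7j), 1j * phi_pair(0.7)[0])       # phi(iy) versus i*phi(y)
print(phi_c(z), phi_c(z + (1 + 1j) * VARPI))    # period (1 + i)*varpi
print(phi_c(z), -phi_c(z + VARPI))              # phi(z + varpi) = -phi(z)
```

In each printed pair the two values should agree up to the accuracy of the numerical integration.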

This section will assume familiarity with standard topics from complex analysis, 
including the Cauchy—Riemann equations and Laurent series. We will refer to [13], 
though the results we need are in most introductory texts on the subject. 


A. A Doubly Periodic Function. As above, we define $\varphi(z) = \varphi(x+iy)$ using
equation (15.22). Here are some basic properties of this function.

Proposition 15.3.1 The function $\varphi(z)$ satisfies the following:
(a) $\varphi(z)$ is analytic for all $z \ne (m+in)\frac{\varpi}{2}$, $m, n$ odd.
(b) The addition law

$$\varphi(z+w) = \frac{\varphi(z)\varphi'(w) + \varphi(w)\varphi'(z)}{1+\varphi^2(z)\varphi^2(w)}$$

holds for all $z, w \in \mathbb{C}$ such that both sides are defined.
(c) For $z \in \mathbb{C}$ and $m, n \in \mathbb{Z}$, we have

$$\varphi(z + m\varpi + n\varpi i) = (-1)^{m+n}\varphi(z).$$

Proof: First observe that $\varphi(z)$ is defined whenever the denominator $1 - \varphi^2(x)\varphi^2(y)$
of (15.22) is nonvanishing. We know that $|\varphi(x)| \le 1$
for all $x \in \mathbb{R}$, with equality if and only if $x$ is an odd multiple of $\frac{\varpi}{2}$. Hence $\varphi(z)$ is
defined on the open set $\Omega = \{z \in \mathbb{C} \mid z \ne (m+in)\frac{\varpi}{2},\ m, n \in \mathbb{Z} \text{ odd}\}$.

Write $\varphi(z) = \varphi(x+iy) = u(x,y) + iv(x,y)$, where $u(x,y)$ and $v(x,y)$ are the real and
imaginary parts of the right-hand side of (15.22). It is easy to see that $u(x,y)$ and $v(x,y)$
are differentiable on $\Omega$ as functions of $x, y$, since $\varphi(x)$ is infinitely differentiable on
$\mathbb{R}$. Furthermore, using the identity $\varphi'^2(x) = 1 - \varphi^4(x)$ for $x \in \mathbb{R}$, it is straightforward
to verify that $u(x,y)$ and $v(x,y)$ satisfy the Cauchy-Riemann equations

$$\frac{\partial u}{\partial x} = \frac{\partial v}{\partial y}, \qquad \frac{\partial u}{\partial y} = -\frac{\partial v}{\partial x}$$

(see Exercise 2). By [13, 1.5.8], it follows that $\varphi(z)$ is analytic on $\Omega$.

For part (b), let $z$ and $w$ be complex variables, and define

$$g(z,w) = \frac{\varphi(z)\varphi'(w) + \varphi(w)\varphi'(z)}{1+\varphi^2(z)\varphi^2(w)}.$$

When $x_0 \in \mathbb{R}$ is fixed, $\varphi(x_0+w)$ and $g(x_0,w)$ are analytic in $w$ and coincide when
$w \in \mathbb{R}$ by the addition law (15.12). By the Identity Theorem [13, 6.1.1], $\varphi(x_0+w) =
g(x_0,w)$ for all $w$ such that both are defined. It follows that when $w_0 \in \mathbb{C}$ is fixed,
$\varphi(z+w_0)$ and $g(z,w_0)$ are analytic in $z$ and coincide when $z \in \mathbb{R}$ and both are defined.
Using the Identity Theorem again, we see that $\varphi(z+w_0) = g(z,w_0)$ for all $z$ such that
both are defined. Since $w_0 \in \mathbb{C}$ is arbitrary, this proves the addition law.

The proof of part (c) requires a series of facts about $\varphi(z)$ and $\varphi'(z)$ that you will
verify in Exercise 2. We begin with the following table of values:


(15.23) 


(15.24)  $$\varphi(iz) = i\varphi(z), \qquad \varphi'(iz) = \varphi'(z).$$

Earlier, these identities were “proved” in order to motivate (15.22). Here, they are
instead rigorous consequences of (15.22).
Since $\varphi$ and $\varphi'$ have period $2\varpi$ on $\mathbb{R}$, (15.23) and (15.24) easily imply that


(15.25) 


Using the addition law, it is now straightforward to show that

(15.26)  $$\varphi(z+m\varpi) = (-1)^m\varphi(z) \quad\text{and}\quad \varphi(z+n\varpi i) = (-1)^n\varphi(z)$$

for $m, n \in \mathbb{Z}$. The desired identity for $\varphi(z + m\varpi + n\varpi i)$ follows immediately. ∎

Part (c) of Proposition 15.3.1 implies that $\varphi$ is doubly periodic:

(15.27)  $$\varphi(z) = \varphi(z + (1+i)\varpi) = \varphi(z + (1-i)\varpi).$$

Note that the periods $(1+i)\varpi$ and $(1-i)\varpi$ are linearly independent over $\mathbb{R}$. The
picture is the following:

(15.28)  [Figure: the period lattice in the complex plane, with the point $2\varpi i$ labeled.]

The dots in this picture are the complex numbers in the set

$$L = \{(m+ni)\varpi \mid m+n \equiv 0 \bmod 2\} = \{m(1+i)\varpi + n(1-i)\varpi \mid m, n \in \mathbb{Z}\}.$$

This is the period lattice of $\varphi$. Double periodicity means that once we know the
values of $\varphi(z)$ for all $z$ in one of the tilted squares, we know its values for all $z \in \mathbb{C}$.


B. Zeros and Poles. Our next task is to study the zeros and poles of $\varphi(z)$. Recall
that $z_0 \in \mathbb{C}$ is a simple zero of an analytic function $g(z)$ if $g(z_0) = 0$ and $g'(z_0) \ne 0$.
This is equivalent to saying that the power series expansion of $g(z)$ at $z_0$ is

$$g(z) = a_1(z-z_0) + \sum_{n=2}^\infty a_n(z-z_0)^n, \qquad a_1 \ne 0.$$

As defined in [13, 3.3.2], $z_0$ is a simple pole of a meromorphic function $g(z)$ if the
Laurent expansion of $g(z)$ at $z_0$ is

$$g(z) = \frac{a_{-1}}{z-z_0} + \sum_{n=0}^\infty a_n(z-z_0)^n, \qquad a_{-1} \ne 0.$$

Theorem 15.3.2 $\varphi(z)$ is meromorphic on $\mathbb{C}$ with the following zeros and poles:
(a) The zeros are all simple and occur at $z = (m+in)\varpi$ for $m, n \in \mathbb{Z}$.
(b) The poles are all simple and occur at $z = (m+in)\frac{\varpi}{2}$ for $m, n$ odd.

Proof: Since $\varphi(0) = 0$ and $\varphi'(0) = 1$, part (c) of Proposition 15.3.1 easily implies
that $\varphi$ has a simple zero at $(m+in)\varpi$ for all $m, n \in \mathbb{Z}$.

Using the addition law together with (15.23) and (15.24), we see that

$$\varphi(z + \tfrac{\varpi}{2}) = \frac{\varphi(z)\varphi'(\frac{\varpi}{2}) + \varphi(\frac{\varpi}{2})\varphi'(z)}{1+\varphi^2(z)\varphi^2(\frac{\varpi}{2})} = \frac{\varphi'(z)}{1+\varphi^2(z)}.$$

Similarly,

$$\varphi(z \pm \tfrac{\varpi}{2}i) = \frac{\pm i\varphi'(z)}{1-\varphi^2(z)}$$

(see Exercise 3). Multiplying these two equations gives the remarkable identity

$$\varphi(z + \tfrac{\varpi}{2})\,\varphi(z \pm \tfrac{\varpi}{2}i) = \left(\frac{\varphi'(z)}{1+\varphi^2(z)}\right)\left(\frac{\pm i\varphi'(z)}{1-\varphi^2(z)}\right) = \pm i,$$

since $\varphi'^2(z) = 1 - \varphi^4(z)$.

Replacing $z$ with $z + \frac{\varpi}{2}$ and using $\varphi(z+\varpi) = -\varphi(z)$ (prove this), we obtain

(15.29)  $$\varphi(z)\,\varphi\bigl(z + (1 \pm i)\tfrac{\varpi}{2}\bigr) = \mp i.$$

Now assume $\varphi(z_0) = 0$. Then $\varphi(z_0 + (1+i)\frac{\varpi}{2})$ is undefined by (15.29). Hence

$$z_0 + (1+i)\tfrac{\varpi}{2} = (m+in)\tfrac{\varpi}{2}, \qquad m, n \text{ odd},$$

by Proposition 15.3.1. It follows easily that $z_0$ is one of our known simple zeros.

To analyze the poles of $\varphi$, we write (15.29) as

$$\varphi\bigl(z + (1 \pm i)\tfrac{\varpi}{2}\bigr) = \frac{\mp i}{\varphi(z)}.$$

Since $\varphi(z)$ has a simple zero at $z = 0$, we see that $\varphi$ has simple poles at $z = (1 \pm i)\frac{\varpi}{2}$.
Using the double periodicity of $\varphi$, we conclude that $\varphi$ has simple poles at $(m+in)\frac{\varpi}{2}$
for $m, n$ odd. Then we are done, since these are the only possible singularities of $\varphi$
by Proposition 15.3.1. ∎
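The identity (15.29), including its sign pattern, can be spot-checked numerically. The sketch below is an illustration only. It assumes the same numerical scheme for the real function (Runge-Kutta on $\varphi'' = -2\varphi^3$) and evaluates the complex function by (15.22); the names `phi_pair`, `phi_c`, and `VARPI` are introduced here and are not from the text.

```python
VARPI = 2.622057554292119  # numerical value of varpi, assumed precomputed

def phi_pair(s, steps=4000):
    """Approximate (phi(s), phi'(s)) for real s via RK4 on phi'' = -2*phi**3."""
    h = s / steps
    p, q = 0.0, 1.0
    f = lambda v: -2.0 * v ** 3
    for _ in range(steps):
        k1p, k1q = q, f(p)
        k2p, k2q = q + 0.5 * h * k1q, f(p + 0.5 * h * k1p)
        k3p, k3q = q + 0.5 * h * k2q, f(p + 0.5 * h * k2p)
        k4p, k4q = q + h * k3q, f(p + h * k3p)
        p += h * (k1p + 2 * k2p + 2 * k3p + k4p) / 6.0
        q += h * (k1q + 2 * k2q + 2 * k3q + k4q) / 6.0
    return p, q

def phi_c(z):
    """Evaluate phi(x + iy) by formula (15.22); undefined at the poles."""
    px, dpx = phi_pair(z.real)
    py, dpy = phi_pair(z.imag)
    return (px * dpy + 1j * py * dpx) / (1.0 - px * px * py * py)

z = 0.3 + 0.2j   # an arbitrary sample point away from zeros and poles
prod_plus = phi_c(z) * phi_c(z + (1 + 1j) * VARPI / 2)
prod_minus = phi_c(z) * phi_c(z + (1 - 1j) * VARPI / 2)
print(prod_plus)    # should be close to -i
print(prod_minus)   # should be close to +i
```

As $z$ approaches a zero of $\varphi$, the second factor must blow up to keep the product constant, which is the mechanism behind the simple poles in Theorem 15.3.2.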


Our next result will play an important role in the next section.

Theorem 15.3.3 Fix a complex number $w_0$. Then the equation $\varphi(z) = w_0$ has a
solution $z \in \mathbb{C}$. Furthermore, if $z_0$ is one solution, then all solutions are given by

$$z = (-1)^{m+n}z_0 + (m+in)\varpi, \qquad m, n \in \mathbb{Z}.$$

Proof: Let $g(z)$ be analytic in a region $\Omega \subset \mathbb{C}$, and let $C \subset \Omega$ be a simple closed
curve, oriented counterclockwise. The Zero-Pole Theorem [13, 6.2.1] says that if
$g(z)$ has no zeros or poles on $C$, then

$$\frac{1}{2\pi i}\int_C \frac{g'(z)}{g(z)}\,dz = Z - P,$$

where $Z$ is the number of zeros of $g(z)$ inside $C$, counted with multiplicity, and $P$ is
the number of poles of $g(z)$ inside $C$, also counted with multiplicity.

The function $g(z) = \varphi(z) - w_0$ has the same poles as $\varphi$, which are $(m+in)\frac{\varpi}{2}$,
$m, n$ odd, by Theorem 15.3.2. This means that we cannot use the tilted squares from
(15.28). However, since the zeros of $g(z)$ are isolated (see [13, 6.1.2]), we can shift
one of the squares to the left as pictured below to obtain a curve $C$ such that $g(z)$ has
neither zeros nor poles on $C$:

(15.30)  [Figure: the shifted tilted square $C$, with the poles of $g(z)$ drawn as open circles.]

The open circles are poles of $g(z)$ and are simple by Theorem 15.3.2. Exactly two
lie in the interior of $C$, so that $P = 2$.

Since $g(z) = \varphi(z) - w_0$ has periods $(1 \pm i)\varpi$, the same is true for $g'(z)$ and
$g'(z)/g(z)$. Opposite edges of $C$ differ by $(1 \pm i)\varpi$, so that $g'(z)/g(z)$ takes the same
values on opposite edges. Hence the integrals along opposite edges cancel, since
they have opposite orientations. This gives

$$Z - 2 = Z - P = \frac{1}{2\pi i}\int_C \frac{g'(z)}{g(z)}\,dz = 0.$$

We conclude that inside $C$, $g(z) = \varphi(z) - w_0$ has either two simple zeros or one double
zero. In particular, $\varphi(z) = w_0$ must have a solution $z_0$ inside $C$.

From $z_0$ and $m, n \in \mathbb{Z}$, Proposition 15.3.1 gives the additional solution

$$\varphi\bigl((-1)^{m+n}z_0 + (m+in)\varpi\bigr) = (-1)^{m+n}\varphi\bigl((-1)^{m+n}z_0\bigr) = \varphi(z_0) = w_0,$$

where the second equality follows since $\varphi$ is odd. We must show that there are
no other solutions. Let $D$ be the region enclosed by $C$ (including the boundary).
Translating $D$ by elements of the period lattice $L = \{(m+ni)\varpi \mid m+n \text{ is even}\}$
covers the entire complex plane. In particular, $-z_0 + \varpi$ has a translate by $L$ that lies
in the interior of $D$, i.e., there are $m, n \in \mathbb{Z}$ with $m+n$ even such that

(15.31)  $$-z_0 + \varpi + (m+in)\varpi = (-1)^{m+n+1}z_0 + ((m+1)+in)\varpi$$

lies inside the curve $C$. If (15.31) differs from $z_0$, then we have found all zeros of
$g(z) = \varphi(z) - w_0$ inside $C$. Since every other zero has a translate by $L$ that lies inside
$C$, it follows that all solutions of $\varphi(z) = w_0$ have the desired form. Finally, if (15.31)
coincides with $z_0$, then it is easy to see that

$$z_0 = (a+ib)\tfrac{\varpi}{2}, \qquad a, b \in \mathbb{Z},\ a+b \text{ odd}.$$

In Exercise 4 you will show that this implies that $g'(z_0) = \varphi'(z_0) = 0$. By what we
proved above, it follows that $z_0$ is the only zero of $g(z)$ inside $C$. As before, we
conclude that the solutions have the desired form. ∎


Mathematical Notes 
Two ideas implicit in this section require further comment. 


■ Elliptic Functions. By Proposition 15.3.1, $\varphi$ is a meromorphic function on $\mathbb{C}$
with periods $(1+i)\varpi$, $(1-i)\varpi$ that are linearly independent over $\mathbb{R}$. In general, an
elliptic function is a meromorphic function on $\mathbb{C}$ with periods $\omega_1, \omega_2$ that are linearly
independent over $\mathbb{R}$. While the basic ideas of elliptic functions go back to Abel and
Jacobi, these days most texts follow the approach of Weierstrass, who defined the
Weierstrass $\wp$-function to be

\[ \wp(z;\omega_1,\omega_2) = \frac{1}{z^2} + \sum_{(m,n)\neq(0,0)} \left( \frac{1}{(z - (n\omega_1 + m\omega_2))^2} - \frac{1}{(n\omega_1 + m\omega_2)^2} \right). \]


For example, if we let $\wp_1(z)$ denote the $\wp$-function with periods $(1+i)\varpi$, $(1-i)\varpi$
(this is the notation of [15]), then one can show that

\[ \varphi(z) = -2\,\frac{\wp_1(z)}{\wp_1'(z)} \quad\text{and}\quad \varphi'(z) = \frac{4\wp_1^2(z) - 1}{4\wp_1^2(z) + 1}. \tag{15.32} \]

Furthermore, the relation

\[ \varphi'^2(z) = 1 - \varphi^4(z) \]

translates into the relation

\[ \wp_1'^2(z) = 4\wp_1^3(z) + \wp_1(z). \tag{15.33} \]

In general, the $\wp$-function $\wp(z) = \wp(z;\omega_1,\omega_2)$ satisfies

\[ \wp'^2(z) = 4\wp^3(z) - g_2\,\wp(z) - g_3, \tag{15.34} \]

where $g_2$ and $g_3$ are constants determined by the periods $\omega_1, \omega_2$. There is also an
addition law for $\wp(z+w)$. Introductions to elliptic functions can be found in [3, §10],
[9, Ch. 3], [12, Sec. 8.3], [14, Ch. 2], [16, Ch. 1], and [20, Ch. 9].


■ Elliptic Curves. The primary geometric object of this chapter is the lemniscate,
which is the curve defined by the equation $(x^2 + y^2)^2 = x^2 - y^2$. However, the elliptic
functions we've been studying lead to other curves of interest. For example, the
relation

\[ \varphi'^2(z) = 1 - \varphi^4(z) \]

shows that the map $z \mapsto (\varphi(z), \varphi'(z))$ parametrizes the curve $y^2 = 1 - x^4$. Similarly,
the relation (15.33) for the Weierstrass $\wp$-function $\wp_1(z)$ shows that $z \mapsto (\wp_1(z), \wp_1'(z))$
parametrizes the curve

\[ y^2 = 4x^3 + x, \]

and for a general $\wp$-function, (15.34) shows that $z \mapsto (\wp(z), \wp'(z))$ parametrizes

\[ y^2 = 4x^3 - g_2 x - g_3. \]

These are elliptic curves. They have an intrinsic addition law compatible with the
addition law of the $\wp$-function. Some of the most important theorems and conjectures
of modern number theory involve elliptic curves. Introductions to this wonderful area
of mathematics can be found in [8], [10], [14], [18], and [20].


Historical Notes 


In the Historical Notes to Section 15.2, we saw that Abel's theory of elliptic
functions began with the integral

\[ \int \frac{dt}{\sqrt{(1 - c^2t^2)(1 + e^2t^2)}}. \]

He denoted the inverse function by $\varphi(x)$ and then defined $f(x) = \sqrt{1 - c^2\varphi^2(x)}$ and
$F(x) = \sqrt{1 + e^2\varphi^2(x)}$. These functions are related via $\varphi'(x) = f(x)F(x)$. Abel gave
addition laws for $\varphi(x)$, $f(x)$, $F(x)$ and multiplication formulas for $\varphi(nx)$, $f(nx)$,
$F(nx)$ similar in spirit to Theorem 15.2.5. He also extended these functions to
functions of $z \in \mathbb{C}$ and determined their periods, zeros, and poles. Abel's paper
[Abel, Vol. I, pp. 263-388] contains many wonderful formulas and is fun to read.
Jacobi developed a similar theory based on the integral (15.21). He defined
functions $\operatorname{sinam} x$, $\operatorname{cosam} x$, and $\Delta\operatorname{am} x$, later simplified to $\operatorname{sn}(x)$, $\operatorname{cn}(x)$, and $\operatorname{dn}(x)$.
His version of the theory became very influential, though it was eventually superseded
by the $\wp$-function introduced by Weierstrass in 1882. One nice result of Weierstrass
is that every elliptic function with the same periods as $\wp(z)$ is a rational function in
$\wp(z)$ and $\wp'(z)$. So once the period lattice is fixed, only two elliptic functions are
needed in order to get all others.
Gauss anticipated most of the work of Abel and Jacobi on elliptic functions but 
never published his results. As he wrote in 1828 [Gauss, Vol. X.1, p. 248], 
I shall most likely not soon prepare my investigations on transcendental functions
that I have had for many years—since 1798—because I have many other matters 
that must be cleared up. Herr Abel has now, I see, anticipated me and relieved 
me of the burden in regard to one third of these matters, particularly since he 
carried out all these developments with great concision and elegance. 


What led Gauss and Abel to work over the complex numbers? It appears that they
were inspired to define $\varphi(z)$ for $z \in \mathbb{C}$ in order to represent all roots of the $n$-division
polynomials of the lemniscate. The high degree of these polynomials suggests that
the roots cannot all be real. In the next section, we will use the theory of complex
multiplication to describe the roots of the $n$-division polynomials.

More on the history of elliptic functions can be found in [3] and [7]. A classic 
treatment of the Jacobian elliptic functions appears in [21]. 


Exercises for Section 15.3 

Exercise 1. Suppose that $g(z)$ is an analytic function satisfying $g(iz) = ig(z)$. Prove that
$g'(iz) = g'(z)$.

Exercise 2. This exercise is concerned with the proof of Proposition 15.3.1.

(a) Prove that $\varphi(x + iy)$, as defined by (15.22), satisfies the Cauchy–Riemann equations.
(b) Prove (15.23), (15.24), (15.25), and (15.26).

Exercise 3. Prove the formula for $\varphi(z + \frac{\varpi}{2} i)$ stated in the proof of Theorem 15.3.2.

Exercise 4. Prove that $\varphi'(z)$ vanishes at all points of the form $(m+in)\frac{\varpi}{2}$, $m+n$ odd.

Exercise 5. A useful observation is that an identity for $\varphi$ proved over $\mathbb{R}$ automatically becomes
an identity over $\mathbb{C}$.

(a) Prove this carefully, using results from complex analysis such as [13, 6.1.1].

(b) Explain why $\varphi'^2(z) = 1 - \varphi^4(z)$ holds for all $z \in \mathbb{C}$.

Exercise 6. By Theorem 15.3.3, $\varphi(z) = \varphi(z_0)$ if and only if $z = (-1)^{m+n} z_0 + (m+in)\varpi$.
Following Abel, prove this using (15.13).


15.4 COMPLEX MULTIPLICATION 


By Exercise 5 of Section 15.3, the multiplication formulas for $\varphi(nx)$, $x \in \mathbb{R}$, extend
to give formulas for $\varphi(nz)$, $z \in \mathbb{C}$. Over $\mathbb{C}$ we also have the formula (15.24) given by

\[ \varphi(iz) = i\varphi(z), \quad i = \sqrt{-1}. \tag{15.35} \]

So besides multiplying by $n \in \mathbb{Z}$, we can also multiply by $i$. Combining these with
the addition law gives formulas for $\varphi((n+im)z)$, where $n+im \in \mathbb{Z}[i]$ is any Gaussian
integer. In other words, $\varphi(z)$ has complex multiplication by $\mathbb{Z}[i]$.

Before developing the general theory, let us give an example to illustrate the power
of complex multiplication.


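Example 15.4.1 In Exercise 1 you will use the addition law together with (15.35)
and $\varphi(-z) = -\varphi(z)$ to prove that

\[ \begin{aligned} \varphi((1+i)z) &= \frac{(1+i)\varphi(z)\varphi'(z)}{1 - \varphi^4(z)}, \\ \varphi((1-i)z) &= \frac{(1-i)\varphi(z)\varphi'(z)}{1 - \varphi^4(z)}. \end{aligned} \tag{15.36} \]

These are simple examples of complex multiplication.
To see the relevance of (15.36), square each side and apply $\varphi'^2(z) = 1 - \varphi^4(z)$.
This gives

\[ \begin{aligned} \varphi^2((1+i)z) &= \frac{2i\varphi^2(z)}{1 - \varphi^4(z)}, \\ \varphi^2((1-i)z) &= \frac{-2i\varphi^2(z)}{1 - \varphi^4(z)}. \end{aligned} \tag{15.37} \]

The surprise is that we've seen disguised versions of these formulas in the proof of
Proposition 15.2.3. To explain why, let $r_0 = \varphi(\frac{x_0}{2})$ and $a = \varphi(x_0)$ as in the proof.
Then set $t = \varphi((1+i)\frac{x_0}{2})$ and apply the first formula of (15.37) to obtain

\[ t^2 = \frac{2i r_0^2}{1 - r_0^4}. \]

Since $2 = (1-i)(1+i)$, the second formula of (15.37) implies that

\[ a^2 = \varphi^2(x_0) = \varphi^2\big((1-i)(1+i)\tfrac{x_0}{2}\big) = \frac{-2i\varphi^2((1+i)\frac{x_0}{2})}{1 - \varphi^4((1+i)\frac{x_0}{2})} = \frac{-2it^2}{1 - t^4}. \]

The above two equations are (15.15) and (15.16) from the proof of Proposition 15.2.3.
Earlier, they seemed to appear out of nowhere, but now that we know complex
multiplication, they are no longer so mysterious.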


The proof of Proposition 15.2.3 used the duplication formula for $\varphi(2x)$. Example 15.4.1
shows that factoring 2 in $\mathbb{Z}[i]$ enables us to factor the duplication formula
into equations that are simpler to understand. We will use similar factorizations in
Section 15.5 when we prove Abel's theorem on the lemniscate.
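
The complex multiplication formula for $1 \pm i$ can be sanity-checked numerically. The sketch below is only an illustration, not part of the text's argument: it assumes the Taylor series of $\varphi$ at 0 can be generated from $\varphi'' = -2\varphi^3$ (obtained by differentiating $\varphi'^2 = 1 - \varphi^4$) with $\varphi(0) = 0$, $\varphi'(0) = 1$, and it tests the identity in the cleared form $\varphi((1 \pm i)z)(1 - \varphi^4(z)) = (1 \pm i)\varphi(z)\varphi'(z)$. The function names and the sample point are hypothetical choices.

```python
def lemniscate_coeffs(n_terms):
    """Taylor coefficients a[n] of phi(z) = sum a[n] z**n, built from
    phi'' = -2 phi**3 with phi(0) = 0 and phi'(0) = 1 (assumed setup)."""
    a = [0.0] * n_terms
    a[1] = 1.0
    for n in range(n_terms - 2):
        # coefficient of z**n in phi**3
        cube = sum(a[i] * a[j] * a[n - i - j]
                   for i in range(n + 1) for j in range(n + 1 - i))
        a[n + 2] = -2.0 * cube / ((n + 2) * (n + 1))
    return a


COEFFS = lemniscate_coeffs(60)


def phi(z):
    """Truncated power series for the lemniscatic function (small |z| only)."""
    return sum(c * z**n for n, c in enumerate(COEFFS))


def dphi(z):
    """Termwise derivative of the truncated series."""
    return sum(n * c * z**(n - 1) for n, c in enumerate(COEFFS) if n > 0)


def residual(z, unit):
    """|phi(unit*z)(1 - phi(z)^4) - unit*phi(z)*phi'(z)| for unit = 1+i or 1-i."""
    lhs = phi(unit * z) * (1 - phi(z)**4)
    rhs = unit * phi(z) * dphi(z)
    return abs(lhs - rhs)


z0 = 0.3 + 0.2j  # sample point well inside the disc of convergence
print(residual(z0, 1 + 1j), residual(z0, 1 - 1j))
```

Both residuals should be at the level of floating-point rounding; the series is only trusted for small $|z|$, since $\varphi$ has poles.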

The theory of complex multiplication gives formulas for $\varphi(\beta z)$, where $z \in \mathbb{C}$ and
$\beta = n+im \in \mathbb{Z}[i]$ is a Gaussian integer. In this section we will first review some
basic facts about $\mathbb{Z}[i]$ and then derive formulas for $\varphi(\beta z)$, paying special attention to
the case when $\beta$ is prime in $\mathbb{Z}[i]$.


A. The Gaussian Integers. The ring of Gaussian integers is defined by

\[ \mathbb{Z}[i] = \{a + ib \mid a,b \in \mathbb{Z}\}. \]

The units of $\mathbb{Z}[i]$ form the group $\mathbb{Z}[i]^* = \{\pm 1, \pm i\} = \{i^\varepsilon \mid \varepsilon = 0,1,2,3\}$, and nonzero
Gaussian integers $\alpha, \beta$ are associate if $\alpha = i^\varepsilon \beta$ for some $i^\varepsilon \in \mathbb{Z}[i]^*$. Furthermore,
$\mathbb{Z}[i]$ is a UFD with the following primes (up to associates):
• $2 = (1+i)(1-i)$, where $1+i$ and $1-i$ are associate primes in $\mathbb{Z}[i]$.
• When $p \equiv 3 \bmod 4$ is prime in $\mathbb{Z}$, $p$ is also prime in $\mathbb{Z}[i]$.
• When $p \equiv 1 \bmod 4$ is prime in $\mathbb{Z}$, there are $a,b \in \mathbb{Z}$ such that $p = a^2 + b^2 =
(a+bi)(a-bi)$, where $a+bi$ and $a-bi$ are nonassociate primes in $\mathbb{Z}[i]$.
Also, $\mathbb{Z}[i]$ is a PID, so that every ideal is of the form $\beta\mathbb{Z}[i]$ for some $\beta \in \mathbb{Z}[i]$. All of
these facts are proved in most books on abstract algebra. See, for example, [Herstein,
Sec. 3.8].
Given $\alpha \in \mathbb{Z}[i]$, we say that $\beta \equiv \gamma \bmod \alpha$ if $\alpha$ divides $\beta - \gamma$ in $\mathbb{Z}[i]$. To understand
the quotient ring $\mathbb{Z}[i]/\alpha\mathbb{Z}[i]$, recall that $\alpha = a + ib \in \mathbb{Z}[i]$ has norm

\[ N(\alpha) = \alpha\bar{\alpha} = |\alpha|^2 = a^2 + b^2 \in \mathbb{Z} \]

such that $N(\alpha\beta) = N(\alpha)N(\beta)$. Then we have the following result.


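Lemma 15.4.2 Let $\alpha$ be a nonzero element of $\mathbb{Z}[i]$. Then:
(a) $\mathbb{Z}[i]/\alpha\mathbb{Z}[i]$ is a finite ring with $N(\alpha)$ elements.
(b) If $\alpha$ is prime, then $\mathbb{Z}[i]/\alpha\mathbb{Z}[i]$ is the finite field

\[ \mathbb{Z}[i]/\alpha\mathbb{Z}[i] \simeq \mathbb{F}_{N(\alpha)}. \]

Proof: You will prove this in Exercises 2 and 3. ∎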
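
Part (a) of the lemma can be checked by brute force for small $\alpha$. The following sketch is illustrative only: Gaussian integers are represented as hypothetical `(re, im)` pairs, and it relies on the observation that $N(\alpha) = \alpha\bar{\alpha}$ lies in $\alpha\mathbb{Z}[i]$, so every residue class has a representative $x + iy$ with $0 \le x, y < N(\alpha)$.

```python
def norm(alpha):
    """N(a + bi) = a^2 + b^2."""
    a, b = alpha
    return a * a + b * b


def divides(alpha, gamma):
    """True if alpha divides gamma in Z[i] (alpha nonzero)."""
    a, b = alpha
    x, y = gamma
    n = a * a + b * b
    # gamma * conj(alpha) = (xa + yb) + (ya - xb) i must be divisible by N(alpha)
    return (x * a + y * b) % n == 0 and (y * a - x * b) % n == 0


def count_residues(alpha):
    """Number of residue classes of Z[i] modulo alpha, found by brute force."""
    n = norm(alpha)
    reps = []
    for x in range(n):
        for y in range(n):
            # keep (x, y) only if it is not congruent to an earlier representative
            if not any(divides(alpha, (x - r[0], y - r[1])) for r in reps):
                reps.append((x, y))
    return len(reps)


for alpha in [(1, 1), (2, 1), (3, 0), (2, 2), (3, 2)]:
    print(alpha, norm(alpha), count_residues(alpha))
```

In each printed row the last two numbers should agree, as the lemma predicts.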


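We say that a Gaussian integer $a+bi \in \mathbb{Z}[i]$ is odd if $a+b$ is odd and even if $a+b$
is even. If $\alpha, \beta \in \mathbb{Z}[i]$, then

\[ \begin{aligned} \alpha\beta \text{ is odd} &\iff \alpha \text{ and } \beta \text{ are odd}, \\ \alpha + \beta \text{ is even} &\iff \alpha, \beta \text{ are both even or both odd}, \\ \alpha \text{ is even} &\iff 1+i \text{ divides } \alpha \end{aligned} \tag{15.38} \]

(see Exercise 4). Since $1+i$ is prime in $\mathbb{Z}[i]$, the last line of (15.38) can be stated as

\[ \alpha \text{ is odd} \iff 1+i \text{ and } \alpha \text{ are relatively prime}. \]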
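
The three rules in (15.38) are easy to test exhaustively on a small box of Gaussian integers. This is a sketch under the assumption that Gaussian integers are stored as `(re, im)` pairs; the helper names are hypothetical, and a finite check is of course no substitute for Exercise 4.

```python
def is_odd(g):
    """a + bi is odd when a + b is odd."""
    a, b = g
    return (a + b) % 2 == 1


def gauss_mul(x, y):
    """Product of two Gaussian integers given as (re, im) pairs."""
    return (x[0] * y[0] - x[1] * y[1], x[0] * y[1] + x[1] * y[0])


def divisible_by_one_plus_i(g):
    """(a + bi)/(1 + i) = ((a + b) + (b - a) i)/2 must have integer parts."""
    a, b = g
    return (a + b) % 2 == 0 and (b - a) % 2 == 0


def parity_rules_hold(bound=4):
    """Check the three lines of (15.38) for all |re|, |im| <= bound."""
    pts = [(a, b) for a in range(-bound, bound + 1)
           for b in range(-bound, bound + 1)]
    for x in pts:
        if (not is_odd(x)) != divisible_by_one_plus_i(x):
            return False
        for y in pts:
            total = (x[0] + y[0], x[1] + y[1])
            if is_odd(gauss_mul(x, y)) != (is_odd(x) and is_odd(y)):
                return False
            if (not is_odd(total)) != (is_odd(x) == is_odd(y)):
                return False
    return True


print(parity_rules_hold())
```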


B. Multiplication by Gaussian Integers. When $n \in \mathbb{Z}$, Theorem 15.2.5 expresses
$\varphi(nz)$ in terms of $\varphi(z)$ when $n$ is odd and in terms of $\varphi(z)$ and $\varphi'(z)$ when $n$
is even. Here, we will generalize the former case by giving formulas for $\varphi(\beta z)$ in
terms of $\varphi(z)$ when $\beta \in \mathbb{Z}[i]$ is odd.

In one sense, the formulas are easy: the proof of Theorem 15.4.4 given below
shows that they are simple consequences of the addition law, the multiplication
formulas for $\varphi(nz)$ from Theorem 15.2.5, and the identity $\varphi(iz) = i\varphi(z)$. However,
in order to prove Abel's theorem on the lemniscate, we need to understand the fine
structure of these formulas.

Here is an example to illustrate the issues involved.

Example 15.4.3 In Exercise 5 you will use the addition formula to derive the formula

\[ \varphi((2+i)z) = \varphi(z)\,\frac{(-2+i)\varphi^8(z) - 6i\varphi^4(z) + 2 + i}{5\varphi^8(z) - 2\varphi^4(z) + 1}. \]

The numerator and denominator have a common factor that can be canceled. In
Exercise 5 you will show that this leads to the simpler formula

\[ \varphi((2+i)z) = -i\varphi(z)\,\frac{\varphi^4(z) + (-1+2i)}{(-1+2i)\varphi^4(z) + 1}. \tag{15.39} \]

We pulled out a factor of $-i$ to ensure that the numerator is monic and the denominator
has constant term 1. Note also the "reverse symmetry" of the coefficients of numerator
and denominator. This will be important below.
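
The cancellation in this example can be verified with a few lines of polynomial arithmetic in $u = \varphi^4(z)$. The sketch below is only a check, not the derivation asked for in Exercise 5: the common factor $(-1-2i)u + 1$ is worked out here by hand as an assumption and then confirmed by multiplying back. Polynomials are hypothetical coefficient lists, lowest degree first, with Python complex numbers standing in for Gaussian integers.

```python
def poly_mul(p, q):
    """Multiply polynomials given as coefficient lists (lowest degree first)."""
    out = [0j] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out


# Unsimplified formula: numerator and denominator as polynomials in u
numerator = [2 + 1j, -6j, -2 + 1j]      # (2+i) - 6i u + (-2+i) u^2
denominator = [1, -2, 5]                # 1 - 2u + 5u^2

# Simplified formula (15.39): -i (u + a1) / (a1 u + 1) with a1 = -1 + 2i
a1 = -1 + 2j
reduced_num = [-1j * a1, -1j]           # -i (u + a1)
reduced_den = [1, a1]                   # 1 + a1 u

# Assumed common factor: conj(a1) u + 1 = (-1 - 2i) u + 1
common = [1, a1.conjugate()]

print(poly_mul(reduced_num, common) == numerator)
print(poly_mul(reduced_den, common) == denominator)
```

Both comparisons should print `True`; the arithmetic is exact because all coefficients are small integers.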


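The following theorem generalizes the formula (15.39) for $\varphi((2+i)z)$.

Theorem 15.4.4 Let $\beta \in \mathbb{Z}[i]$ be odd. Then there exist relatively prime polynomials
$P_\beta(u), Q_\beta(u)$ in the polynomial ring $\mathbb{Z}[i][u]$ and $\varepsilon \in \{0,1,2,3\}$ such that:
(a) For all $z \in \mathbb{C}$, we have

\[ \varphi(\beta z) = i^\varepsilon \varphi(z)\,\frac{P_\beta(\varphi^4(z))}{Q_\beta(\varphi^4(z))}. \]

(b) $\beta \equiv i^\varepsilon \bmod 2(1+i)$.

(c) $P_\beta(u)$ and $Q_\beta(u)$ have degree $d = (N(\beta) - 1)/4$, where $N(\beta)$ is the norm of $\beta$.

(d) The roots of the $\beta$-division polynomial $uP_\beta(u^4)$ are the complex numbers $\varphi(\alpha\frac{\varpi}{\beta})$
for $\alpha \in \mathbb{Z}[i]$ odd.

(e) $P_\beta(u)$ is monic, $Q_\beta(0) = 1$, and $Q_\beta(u) = u^d P_\beta(1/u)$, where $d$ is from part (c).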

Before beginning the proof, let us explain what the theorem says about $\varphi(\beta z)$
when $\beta \in \mathbb{Z}[i]$ is odd. Let $P_\beta(u), Q_\beta(u) \in \mathbb{Z}[i][u]$ be the polynomials given in the
theorem. Parts (c) and (e) imply that $P_\beta(u), Q_\beta(u)$ can be written in the form

\[ \begin{aligned} P_\beta(u) &= u^d + a_1u^{d-1} + \cdots + a_d, \\ Q_\beta(u) &= u^d P_\beta(1/u) \\ &= u^d\big((1/u)^d + a_1(1/u)^{d-1} + \cdots + a_d\big) \\ &= 1 + a_1u + \cdots + a_du^d, \end{aligned} \]

where $d = (N(\beta) - 1)/4$ and $a_1,\dots,a_d \in \mathbb{Z}[i]$. This is the "reverse symmetry"
mentioned above. Then the complex multiplication formula for $\varphi(\beta z)$ can be written

\[ \varphi(\beta z) = i^\varepsilon \varphi(z)\,\frac{\varphi^{4d}(z) + a_1\varphi^{4d-4}(z) + \cdots + a_d}{1 + a_1\varphi^4(z) + \cdots + a_d\varphi^{4d}(z)}, \]

where $\beta \equiv i^\varepsilon \bmod 2(1+i)$ by part (b). Here is an example.

Example 15.4.5 Suppose that $\beta = 2+i$. Since $d = (N(\beta) - 1)/4 = (5-1)/4 = 1$
and $\beta \equiv -i \bmod 2(1+i)$, the above formula reduces to

\[ \varphi((2+i)z) = -i\varphi(z)\,\frac{\varphi^4(z) + a_1}{1 + a_1\varphi^4(z)}, \]

where $a_1 \in \mathbb{Z}[i]$. Comparing this to (15.39), we see that $a_1 = -1+2i$.
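
The two invariants used in this example, the exponent $\varepsilon$ with $\beta \equiv i^\varepsilon \bmod 2(1+i)$ and the degree $d = (N(\beta)-1)/4$, are straightforward to compute. The sketch below is an illustration with hypothetical function names; it stores $\beta$ as a pair `(re, im)` and tests divisibility by $2(1+i) = 2+2i$ via $(x+iy)/(2+2i) = ((x+y) + i(y-x))/4$.

```python
UNIT_POWERS = [(1, 0), (0, 1), (-1, 0), (0, -1)]  # i^0, i^1, i^2, i^3


def epsilon(beta):
    """Return e in {0,1,2,3} with beta = i^e mod 2(1+i); beta must be odd."""
    x, y = beta
    if (x + y) % 2 == 0:
        raise ValueError("beta must be an odd Gaussian integer")
    for e, (ur, ui) in enumerate(UNIT_POWERS):
        dx, dy = x - ur, y - ui
        # (dx + i dy) is divisible by 2 + 2i exactly when both parts below are
        if (dx + dy) % 4 == 0 and (dy - dx) % 4 == 0:
            return e
    raise AssertionError("unreachable for odd beta")


def degree_d(beta):
    """d = (N(beta) - 1)/4 for odd beta."""
    x, y = beta
    return (x * x + y * y - 1) // 4


for beta in [(2, 1), (3, 0), (5, 0)]:
    print(beta, epsilon(beta), degree_d(beta))
```

For $\beta = 2+i$ this gives $\varepsilon = 3$ (so $i^\varepsilon = -i$) and $d = 1$, matching the example.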
The following lemma will be useful in the proof of Theorem 15.4.4. 


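Lemma 15.4.6 Let $\beta \in \mathbb{Z}[i]$ be odd. Then the set

\[ R_\beta = \{\varphi(z) \mid z \in \mathbb{C},\ \varphi(\beta z) = 0\} \]

has precisely $N(\beta)$ elements and consists of all complex numbers of the form

\[ \varphi(\alpha\tfrac{\varpi}{\beta}), \quad \alpha \in \mathbb{Z}[i] \text{ odd}. \]

Proof: First observe that if $\alpha \in \mathbb{Z}[i]$ is odd, then $\varphi(\alpha\frac{\varpi}{\beta}) \in R_\beta$, since
$\varphi(\beta \cdot \alpha\frac{\varpi}{\beta}) = \varphi(\alpha\varpi) = 0$, where the last equality is by Theorem 15.3.2. Going the other way,
suppose that $\varphi(\beta z) = 0$. Then Theorem 15.3.2 implies that

\[ \beta z = (a+ib)\varpi, \quad a,b \in \mathbb{Z}. \]

Let $\alpha = a+ib \in \mathbb{Z}[i]$. Then $z = \alpha\frac{\varpi}{\beta}$, so that $\varphi(z) = \varphi(\alpha\frac{\varpi}{\beta})$. If $\alpha$ is odd, then we
are done. On the other hand, if $\alpha$ is even, then $\beta - \alpha$ is odd. Using the identity
$\varphi(\varpi - z) = \varphi(z)$ from (15.9), we obtain

\[ \varphi((\beta - \alpha)\tfrac{\varpi}{\beta}) = \varphi(\varpi - \alpha\tfrac{\varpi}{\beta}) = \varphi(\alpha\tfrac{\varpi}{\beta}). \]

This shows that the elements of $R_\beta$ have the desired form.
To determine the size of $R_\beta$, fix $\varphi(\alpha\frac{\varpi}{\beta}) \in R_\beta$, where $\alpha \in \mathbb{Z}[i]$ is odd. We claim
that $\alpha$ is unique modulo $\beta\mathbb{Z}[i]$. To see why, suppose that

\[ \varphi(\alpha\tfrac{\varpi}{\beta}) = \varphi(\tilde{\alpha}\tfrac{\varpi}{\beta}), \quad \alpha, \tilde{\alpha} \in \mathbb{Z}[i] \text{ odd}. \]

By Theorem 15.3.3, there is $a+ib \in \mathbb{Z}[i]$ such that

\[ \tilde{\alpha}\tfrac{\varpi}{\beta} = (-1)^{a+b}\alpha\tfrac{\varpi}{\beta} + (a+ib)\varpi. \]

This implies that

\[ \tilde{\alpha} = (-1)^{a+b}\alpha + (a+ib)\beta. \]

Since $\alpha$, $\tilde{\alpha}$, and $\beta$ are odd, $a+ib$ is even by (15.38). Thus $(-1)^{a+b} = 1$ and hence

\[ \tilde{\alpha} = \alpha + (a+ib)\beta, \]

so that $\alpha$ and $\tilde{\alpha}$ give the same element of $\mathbb{Z}[i]/\beta\mathbb{Z}[i]$. Since every coset of $\mathbb{Z}[i]/\beta\mathbb{Z}[i]$
can be represented by an odd Gaussian integer (given any $\alpha$, either $\alpha$ or $\alpha + \beta$ is
odd), it follows that

\[ |R_\beta| = |\mathbb{Z}[i]/\beta\mathbb{Z}[i]| = N(\beta), \]

where the last equality follows from Lemma 15.4.2. ∎

We now turn to the proof of the theorem.
Proof of Theorem 15.4.4: We will prove the theorem in five steps.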


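Step 1: Existence of $P_\beta(u)$ and $Q_\beta(u)$ for all $\beta$. Given $\beta \in \mathbb{Z}[i]$, we claim that there
are polynomials $P_\beta(u), Q_\beta(u) \in \mathbb{Z}[i][u]$ such that $Q_\beta(0) = 1$ and

\[ \varphi(\beta z) = \varphi(z)\,\frac{P_\beta(\varphi^4(z))}{Q_\beta(\varphi^4(z))} \quad \text{when } \beta \text{ is odd}, \tag{15.40} \]

and

\[ \varphi(\beta z) = \varphi(z)\,\frac{P_\beta(\varphi^4(z))}{Q_\beta(\varphi^4(z))}\,\varphi'(z) \quad \text{when } \beta \text{ is even}. \tag{15.41} \]

We will prove the formulas (15.40) and (15.41) using the multiplication formulas
from Theorem 15.2.5 together with the identities

\[ \begin{aligned} \varphi(iz) &= i\varphi(z), \\ \varphi((1+i)z) &= \frac{(1+i)\varphi(z)\varphi'(z)}{1 - \varphi^4(z)}, \\ \varphi((\beta+1)z) &= -\varphi((\beta-1)z) + \frac{2\varphi(\beta z)\varphi'(z)}{1 + \varphi^2(\beta z)\varphi^2(z)}, \\ \varphi((\beta+i)z) &= -\varphi((\beta-i)z) + \frac{2\varphi(\beta z)\varphi'(z)}{1 - \varphi^2(\beta z)\varphi^2(z)}. \end{aligned} \tag{15.42} \]

We already know the first and second lines, and the third and fourth lines follow from
the first and (15.13) (see Exercise 6).

The formulas for $\varphi(iz)$ and $\varphi((1+i)z)$ from (15.42) satisfy (15.40) and (15.41).
From here, repeated use of the third line of (15.42) shows that for all integers $n > 0$,
there are polynomials $P_{n+i}(u), Q_{n+i}(u) \in \mathbb{Z}[i][u]$ that give the desired formula for
$\varphi((n+i)z)$. The argument is similar to what we did in the proof of Theorem 15.2.5.
In particular, when $n+i$ is even, we get the recursion

\[ Q_{n+1+i}(u) = Q_{n-1+i}(u)\big(Q_{n+i}^2(u) + uP_{n+i}^2(u)(1-u)\big) \]

similar to (15.18). This makes it easy to show that $Q_{n+i}(0) = 1$ for all $n > 0$ even,
and the argument that $Q_{n+i}(0) = 1$ for $n > 0$ odd is similar.

Now fix an integer $n > 0$. We have formulas for $\varphi((n+i)z)$ (just proved) and
$\varphi(nz)$ (by Theorem 15.2.5). Then repeated use of the fourth line of (15.42) shows
that for all $m > 0$, there are polynomials $P_{n+im}(u), Q_{n+im}(u) \in \mathbb{Z}[i][u]$ that give the
desired formula for $\varphi((n+im)z)$ and satisfy $Q_{n+im}(0) = 1$. See Exercise 7.

Hence we have formulas for $\varphi((n+im)z)$ for all integers $n,m > 0$. Then

\[ \begin{aligned} \varphi((-m+in)z) &= \varphi(i(n+im)z) = i\varphi((n+im)z), \\ \varphi((-n-im)z) &= \varphi(-(n+im)z) = -\varphi((n+im)z), \\ \varphi((m-in)z) &= \varphi(-i(n+im)z) = -i\varphi((n+im)z) \end{aligned} \tag{15.43} \]

make it easy to construct the desired $P_\beta(u), Q_\beta(u) \in \mathbb{Z}[i][u]$ for all $\beta \in \mathbb{Z}[i]$.

Step 2: Remove Common Factors. For the rest of the proof, we will assume that $\beta$
is odd. The polynomials $P_\beta(u), Q_\beta(u)$ constructed in Step 1 might have a common
factor. Since $\mathbb{Z}[i]$ is a UFD, the same is true for $\mathbb{Z}[i][u]$ by Theorem A.5.6. Thus

\[ P_\beta(u) = C_\beta(u)\tilde{P}_\beta(u) \quad\text{and}\quad Q_\beta(u) = C_\beta(u)\tilde{Q}_\beta(u), \]

where $C_\beta(u), \tilde{P}_\beta(u), \tilde{Q}_\beta(u) \in \mathbb{Z}[i][u]$ and $\tilde{P}_\beta(u), \tilde{Q}_\beta(u)$ are relatively prime. Since
$Q_\beta(0) = 1$, we can multiply $C_\beta(u), \tilde{P}_\beta(u), \tilde{Q}_\beta(u)$ by suitable units in $\mathbb{Z}[i]^* = \{\pm 1, \pm i\}$
so that $\tilde{Q}_\beta(0) = 1$. Since $\beta$ is odd, we have

\[ \varphi(\beta z) = \varphi(z)\,\frac{P_\beta(\varphi^4(z))}{Q_\beta(\varphi^4(z))} = \varphi(z)\,\frac{C_\beta(\varphi^4(z))\tilde{P}_\beta(\varphi^4(z))}{C_\beta(\varphi^4(z))\tilde{Q}_\beta(\varphi^4(z))} = \varphi(z)\,\frac{\tilde{P}_\beta(\varphi^4(z))}{\tilde{Q}_\beta(\varphi^4(z))}. \]

Hence we may assume $P_\beta(u), Q_\beta(u)$ are relatively prime in $\mathbb{Z}[i][u]$ with $Q_\beta(0) = 1$.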


Step 3: The Constant $i^\varepsilon$. In Exercise 8 you will show that $(\mathbb{Z}[i]/2(1+i)\mathbb{Z}[i])^* =
\{\pm[1], \pm[i]\}$, so that $\beta \equiv i^\varepsilon \bmod 2(1+i)$ for some $\varepsilon \in \{0,1,2,3\}$. Multiplying $P_\beta(u)$
by a suitable unit of $\mathbb{Z}[i]^*$, we obtain the equation

\[ \varphi(\beta z) = i^\varepsilon \varphi(z)\,\frac{P_\beta(\varphi^4(z))}{Q_\beta(\varphi^4(z))}. \tag{15.44} \]

In Exercise 8 you will also show that

\[ \varphi(\beta\tfrac{\varpi}{2}) = i^\varepsilon. \tag{15.45} \]

This will be useful later in the proof. It follows that the relatively prime polynomials
$P_\beta(u), Q_\beta(u) \in \mathbb{Z}[i][u]$ satisfy parts (a) and (b) of the theorem together with the
condition $Q_\beta(0) = 1$ from part (e). Steps 4 and 5 will show that $P_\beta(u), Q_\beta(u)$ satisfy
the remaining conditions of the theorem.


Step 4: The Roots of $uP_\beta(u^4)$. We will use Lemma 15.4.6 to determine the roots of
the $\beta$-division polynomial $A_\beta(u) = uP_\beta(u^4)$. Also let $B_\beta(u) = Q_\beta(u^4)$. Since $\beta$ is
odd, (15.44) implies that

\[ \varphi(\beta z) = i^\varepsilon\,\frac{A_\beta(\varphi(z))}{B_\beta(\varphi(z))}. \tag{15.46} \]

In Exercise 9 you will show that $A_\beta(u)$ and $B_\beta(u)$ have no common roots in $\mathbb{C}$, since
$Q_\beta(0) = 1$ and $P_\beta(u), Q_\beta(u)$ are relatively prime in $\mathbb{Z}[i][u]$. Using this and (15.46),
it follows that

\[ A_\beta(\varphi(z)) = 0 \iff \varphi(\beta z) = 0. \]

Since any root of $A_\beta(u)$ is of the form $\varphi(z)$ for some $z \in \mathbb{C}$ by Theorem 15.3.3, we
conclude that the roots of $A_\beta(u)$ form the set

\[ R_\beta = \{\varphi(z) \mid z \in \mathbb{C},\ \varphi(\beta z) = 0\} \]

from Lemma 15.4.6. Then the lemma implies that the roots can be written in the
form described in part (d) of the theorem.

We next show that all roots of $A_\beta(u)$ have multiplicity 1. Assume that $u_0 = \varphi(z_0)$
is a multiple root. Then $A_\beta(u_0) = A_\beta'(u_0) = 0$, and hence $B_\beta(u_0) \neq 0$ by the previous
paragraph. Differentiating (15.46) with respect to $z$ and substituting $z = z_0$ gives

\[ \varphi'(\beta z_0)\,\beta = i^\varepsilon\,\frac{B_\beta(u_0)A_\beta'(u_0)\varphi'(z_0) - B_\beta'(u_0)A_\beta(u_0)\varphi'(z_0)}{B_\beta(u_0)^2} = 0 \]

(note that $\varphi'(z_0)$ is defined because $\varphi(z_0)$ is). Since $\varphi(\beta z_0) = 0$, $\varphi$ has a multiple
zero at $\beta z_0$. This is impossible by Theorem 15.3.2. Hence $A_\beta(u)$ has simple roots.

We conclude that the degree of $A_\beta(u)$ is the number of elements in $R_\beta$. By
Lemma 15.4.6, it follows that $A_\beta(u) = uP_\beta(u^4)$ has degree $N(\beta)$, so that $P_\beta(u)$ has
degree $d = (N(\beta) - 1)/4$. This proves part (c) for $P_\beta(u)$.


Step 5: Relate $P_\beta$ and $Q_\beta$. Once we show that

\[ Q_\beta(u) = u^d P_\beta(1/u), \quad d = (N(\beta) - 1)/4, \tag{15.47} \]

it will follow immediately that $Q_\beta(u)$ has degree $d$ and $P_\beta(u)$ is monic (since $Q_\beta(u)$
has constant term 1). Thus we need only prove (15.47) to complete the proof.
The identity (15.29) implies that

\[ \varphi(z)\,\varphi\big(z + (1+i)\tfrac{\varpi}{2}\big) = -i = i^3. \]

Setting $w = z + (1+i)\frac{\varpi}{2}$, we obtain

\[ \varphi(z)\varphi(w) = i^3. \tag{15.48} \]

In Exercise 10 you will use (15.48) and $\beta \equiv i^\varepsilon \bmod 2(1+i)$ to show that

\[ \varphi(\beta z)\varphi(\beta w) = i^{3+2\varepsilon}. \tag{15.49} \]

Then

\[ \frac{\varphi(\beta z)}{i^\varepsilon\varphi(z)} = \frac{i^\varepsilon\varphi(w)}{\varphi(\beta w)} = \frac{Q_\beta(\varphi^4(w))}{P_\beta(\varphi^4(w))} = \frac{Q_\beta(1/\varphi^4(z))}{P_\beta(1/\varphi^4(z))}, \tag{15.50} \]

where the first equality uses (15.48) and (15.49), the second uses (15.44) with $w$
in place of $z$, and the third follows by raising (15.48) to the fourth power to obtain
$\varphi^4(w) = 1/\varphi^4(z)$. Comparing (15.50) with (15.44), we conclude that

\[ \frac{Q_\beta(1/u^4)}{P_\beta(1/u^4)} = \frac{P_\beta(u^4)}{Q_\beta(u^4)} \]

as rational functions in $u$ with coefficients in $\mathbb{Q}(i)$.

Recall from Step 4 that $\deg(P_\beta(u)) = d$, where $d = (N(\beta) - 1)/4$. In Exercise 11
you will show that the above equation implies that

\[ u^d P_\beta(1/u) = \lambda Q_\beta(u) \tag{15.51} \]

for some nonzero constant $\lambda \in \mathbb{Q}(i)$. However, if we evaluate (15.44) at $z = \frac{\varpi}{2}$ and
use (15.45) and $\varphi(\frac{\varpi}{2}) = 1$, then we obtain

\[ i^\varepsilon = i^\varepsilon\,\frac{P_\beta(1)}{Q_\beta(1)}. \]

Thus $P_\beta(1) = Q_\beta(1) \neq 0$. Then substituting $u = 1$ into (15.51) implies that $\lambda = 1$, so
that $Q_\beta(u) = u^d P_\beta(1/u)$. This completes the proof. ∎


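Here are two examples of Theorem 15.4.4 from earlier in the chapter.

Example 15.4.7 When $\beta = 3$, equation (15.17) gives

\[ \varphi(3z) = -\varphi(z)\,\frac{\varphi^8(z) + 6\varphi^4(z) - 3}{1 + 6\varphi^4(z) - 3\varphi^8(z)}. \]

In the notation of Theorem 15.4.4, this means

\[ P_3(u) = u^2 + 6u - 3 \quad\text{and}\quad Q_3(u) = u^2P_3(1/u) = 1 + 6u - 3u^2. \]

These polynomials have degree $(N(3) - 1)/4 = 2$. Note also that $i^\varepsilon = -1$, since
$3 \equiv -1 \bmod 2(1+i)$.
When $\beta = 5$, equation (15.19) gives

\[ \varphi(5z) = \varphi(z)\,\frac{P_5(\varphi^4(z))}{Q_5(\varphi^4(z))}, \]

where

\[ \begin{aligned} P_5(u) &= u^6 + 50u^5 - 125u^4 + 300u^3 - 105u^2 - 62u + 5, \\ Q_5(u) &= 1 + 50u - 125u^2 + 300u^3 - 105u^4 - 62u^5 + 5u^6. \end{aligned} \]

These polynomials have degree $(N(5) - 1)/4 = 6$ and satisfy $Q_5(u) = u^6P_5(1/u)$.
Furthermore, we have $i^\varepsilon = 1$, since $5 \equiv 1 \bmod 2(1+i)$.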
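
The reverse symmetry $Q_\beta(u) = u^dP_\beta(1/u)$ and the equality $P_\beta(1) = Q_\beta(1)$ from Step 5 can be confirmed for these two examples with exact rational arithmetic. The sketch below is only a check of the stated polynomials; the coefficient lists and helper names are hypothetical conventions (highest degree first for both $P$ and $Q$).

```python
from fractions import Fraction


def evaluate(coeffs_desc, u):
    """Horner evaluation; coefficients are listed from highest degree down."""
    value = Fraction(0)
    for c in coeffs_desc:
        value = value * u + c
    return value


def has_reverse_symmetry(p_desc, q_desc):
    """Test Q(u) = u^d P(1/u) at more sample points than the degree d."""
    d = len(p_desc) - 1
    samples = [Fraction(k, 7) for k in range(1, d + 3)]
    return all(evaluate(q_desc, u) == u**d * evaluate(p_desc, 1 / u)
               for u in samples)


P3 = [1, 6, -3]                            # u^2 + 6u - 3
Q3 = [-3, 6, 1]                            # 1 + 6u - 3u^2
P5 = [1, 50, -125, 300, -105, -62, 5]      # u^6 + 50u^5 - ... + 5
Q5 = [5, -62, -105, 300, -125, 50, 1]      # 1 + 50u - ... + 5u^6

for name, p, q in [("beta = 3", P3, Q3), ("beta = 5", P5, Q5)]:
    print(name, has_reverse_symmetry(p, q),
          evaluate(p, Fraction(1)) == evaluate(q, Fraction(1)))
```

Since both sides are polynomials of degree at most $d$, agreement at $d+2$ points already forces equality.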


In general, one can show that when $n > 0$ is in $\mathbb{Z}$, the polynomials $P_n(u)$ and $Q_n(u)$
from Theorem 15.4.4 lie in $\mathbb{Z}[u]$.


C. Multiplication by Gaussian Primes. When $\beta$ is an odd prime in $\mathbb{Z}[i]$,
Theorem 15.4.4 has the following important refinement due to Eisenstein. This
result will play a crucial role in the proof of Abel's theorem.




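Theorem 15.4.8 Let $\beta \in \mathbb{Z}[i]$ be an odd prime, and let

\[ P_\beta(u) = u^d + a_1u^{d-1} + \cdots + a_d \in \mathbb{Z}[i][u], \quad d = (N(\beta) - 1)/4, \]

be the corresponding polynomial from Theorem 15.4.4. Then:
(a) $a_1,\dots,a_d$ are divisible by $\beta$ and $a_d = i^{-\varepsilon}\beta$, where $\beta \equiv i^\varepsilon \bmod 2(1+i)$.
(b) $P_\beta(u^4)$ is irreducible over $\mathbb{Q}(i)$.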


Proof: Our proof will follow [14] and is based on Eisenstein's original proof from
1850 [Eisenstein, pp. 556-619]. We first observe that the Schönemann–Eisenstein
criterion, stated in Theorem 4.2.3 for polynomials in $\mathbb{Z}[u]$ and primes in $\mathbb{Z}$, also applies
to polynomials in $\mathbb{Z}[i][u]$ and primes in $\mathbb{Z}[i]$. You will prove this in Exercise 12. Then
part (a) implies that

\[ P_\beta(u^4) = u^{4d} + a_1u^{4(d-1)} + \cdots + a_d \in \mathbb{Z}[i][u] \]

satisfies the criterion for the Gaussian prime $\beta$ and hence is irreducible over $\mathbb{Q}(i)$.
Thus part (b) of the theorem follows from part (a).
Proving part (a) will be harder. Since $\beta$ is odd, Theorem 15.4.4 implies that

\[ \varphi(\beta z) = i^\varepsilon\varphi(z)\,\frac{\varphi^{4d}(z) + a_1\varphi^{4(d-1)}(z) + \cdots + a_d}{1 + a_1\varphi^4(z) + \cdots + a_d\varphi^{4d}(z)}, \tag{15.52} \]

where the coefficients $a_1,\dots,a_d \in \mathbb{Z}[i]$ depend on $\beta$. To prove part (a), we will
analyze the relation between $a_1,\dots,a_d$ and $\beta$ by expanding each side of (15.52) as a
power series in $z$.
Several power series will appear in the proof. The first comes from

\[ i^\varepsilon\,\frac{u^d + a_1u^{d-1} + \cdots + a_d}{1 + a_1u + \cdots + a_du^d}, \]

which we write as

\[ i^\varepsilon\,\frac{u^d + a_1(\beta)u^{d-1} + \cdots + a_d(\beta)}{1 + a_1(\beta)u + \cdots + a_d(\beta)u^d} \]

to emphasize the dependence on $\beta$. This rational function is analytic at $u = 0$ (the
denominator doesn't vanish at 0) and hence has a power series expansion

\[ i^\varepsilon\,\frac{u^d + a_1(\beta)u^{d-1} + \cdots + a_d(\beta)}{1 + a_1(\beta)u + \cdots + a_d(\beta)u^d} = \sum_{k=0}^{\infty} b_k(\beta)u^k = b_0(\beta) + b_1(\beta)u + b_2(\beta)u^2 + \cdots. \tag{15.53} \]

In Exercise 13 you will prove that $b_k(\beta) \in \mathbb{Z}[i]$ for all $k$. This follows because the
constant term in the denominator is 1 and the other coefficients lie in $\mathbb{Z}[i]$. Using the
power series (15.53), the multiplication formula (15.52) can be written

\[ \begin{aligned} \varphi(\beta z) &= \varphi(z)\big(b_0(\beta) + b_1(\beta)\varphi^4(z) + b_2(\beta)\varphi^8(z) + \cdots\big) \\ &= b_0(\beta)\varphi(z) + b_1(\beta)\varphi^5(z) + b_2(\beta)\varphi^9(z) + \cdots. \end{aligned} \tag{15.54} \]

The second power series comes from $\varphi(z)$. Since $\varphi(z)$ is analytic at $z = 0$, it
can be expanded in a power series in $z$. In Exercise 14 you will use $\varphi(iz) = i\varphi(z)$,
$\varphi'(0) = 1$, and $\varphi'^2(z) = 1 - \varphi^4(z)$ to prove that the power series has the form

\[ \varphi(z) = z + \sum_{j=1}^{\infty} c_jz^{4j+1} = z + c_1z^5 + c_2z^9 + \cdots, \quad c_j \in \mathbb{Q}. \tag{15.55} \]

You will also show that $c_1 = -\frac{1}{10}$ and $c_2 = \frac{1}{120}$. Then replacing $z$ with $\beta z$ in (15.55)
gives the third power series

\[ \varphi(\beta z) = \sum_{j=0}^{\infty} c_j\beta^{4j+1}z^{4j+1} = \beta z + c_1\beta^5z^5 + c_2\beta^9z^9 + \cdots. \tag{15.56} \]

From here, the proof proceeds in three steps. Here is an overview of what we will
do in each step:

• Step 1. Derive a formula for $b_k(\beta)$ in terms of $\beta$ that holds for all odd $\beta \in \mathbb{Z}[i]$.
This will follow by substituting the series for $\varphi(z)$ and $\varphi(\beta z)$ into (15.54).

• Step 2. Prove that $\beta$ divides $b_0(\beta),\dots,b_{d-1}(\beta)$ when $\beta$ is an odd prime. This
will be done by analyzing the formula of Step 1 using a clever idea of Eisenstein.

• Step 3. Relate $a_1(\beta),\dots,a_d(\beta)$ to $b_0(\beta),\dots,b_{d-1}(\beta)$ and conclude that $\beta$ divides
$a_1(\beta),\dots,a_d(\beta)$. This will follow easily from (15.53).


We now turn to the first step. 


Step 1: Express $b_k(\beta)$ in terms of $\beta$. If we substitute (15.55) and (15.56) into the
identity (15.54), then we obtain

\[ \begin{aligned} \beta z + c_1\beta^5z^5 + c_2\beta^9z^9 + \cdots ={}& b_0(\beta)(z + c_1z^5 + c_2z^9 + \cdots) \\ &+ b_1(\beta)(z + c_1z^5 + c_2z^9 + \cdots)^5 \\ &+ b_2(\beta)(z + c_1z^5 + c_2z^9 + \cdots)^9 + \cdots. \end{aligned} \tag{15.57} \]

When we expand the right-hand side of (15.57), a given power of $z$ appears only
finitely often, since all terms of

\[ (z + c_1z^5 + c_2z^9 + \cdots)^{4j+1} = z^{4j+1}(1 + c_1z^4 + c_2z^8 + \cdots)^{4j+1} \]

have degree $\geq 4j+1$ in $z$. In Exercise 15 you will show that up to degree 9 in $z$, the
right-hand side of (15.57) begins with

\[ b_0(\beta)z + (b_0(\beta)c_1 + b_1(\beta))z^5 + (b_0(\beta)c_2 + 5b_1(\beta)c_1 + b_2(\beta))z^9 + \cdots. \tag{15.58} \]

Since this equals $\beta z + c_1\beta^5z^5 + c_2\beta^9z^9 + \cdots$, comparing coefficients gives

\[ \begin{aligned} \beta &= b_0(\beta), \\ c_1\beta^5 &= b_0(\beta)c_1 + b_1(\beta), \\ c_2\beta^9 &= b_0(\beta)c_2 + 5b_1(\beta)c_1 + b_2(\beta), \end{aligned} \]

and then solving for $b_0(\beta), b_1(\beta), b_2(\beta)$ yields

\[ \begin{aligned} b_0(\beta) &= \beta, \\ b_1(\beta) &= \beta(c_1\beta^4 - c_1), \\ b_2(\beta) &= \beta(c_2\beta^8 - 5c_1^2\beta^4 + 5c_1^2 - c_2). \end{aligned} \]

These equations hold for all odd $\beta \in \mathbb{Z}[i]$. We will see below that $b_0(\beta) = \beta$ is very
important.

In general, one can prove (see Exercise 16) that for any $k$, there is a polynomial
$S_k(u) \in \mathbb{Q}[u]$ of degree $4k$ such that

\[ b_k(\beta) = \beta S_k(\beta), \quad \beta \in \mathbb{Z}[i] \text{ odd}. \tag{15.59} \]

This follows because the $c_j$ all lie in $\mathbb{Q}$. The crucial thing here is that the same
polynomial $S_k(u)$ works for all odd $\beta$. For example, since $c_1 = -\frac{1}{10}$, the above
equations imply that

\[ b_1(\beta) = \beta S_1(\beta), \quad S_1(u) = -\tfrac{1}{10}u^4 + \tfrac{1}{10}. \]
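
The formulas for $b_0(\beta), b_1(\beta), b_2(\beta)$ can be cross-checked against the explicit polynomials for $\beta = 3$ and $\beta = 5$ from Example 15.4.7. The sketch below is illustrative only and makes two assumptions: the $c_j$ are generated from $\varphi'' = -2\varphi^3$ (the derivative of $\varphi'^2 = 1 - \varphi^4$), and the $b_k$ are obtained by dividing power series as in (15.53). Function names are hypothetical.

```python
from fractions import Fraction


def lemniscate_c(count):
    """c_0, ..., c_{count-1} with phi(z) = sum c_j z^(4j+1), via phi'' = -2 phi^3."""
    n_terms = 4 * (count - 1) + 2
    a = [Fraction(0)] * n_terms
    a[1] = Fraction(1)
    for n in range(n_terms - 2):
        cube = sum((a[i] * a[j] * a[n - i - j]
                    for i in range(n + 1) for j in range(n + 1 - i)),
                   Fraction(0))
        a[n + 2] = -2 * cube / ((n + 2) * (n + 1))
    return [a[4 * j + 1] for j in range(count)]


def series_quotient(num, den, count):
    """First `count` coefficients of num(u)/den(u); lists are lowest degree first
    and den[0] must equal 1."""
    b = []
    for k in range(count):
        acc = Fraction(num[k] if k < len(num) else 0)
        for j in range(1, min(k, len(den) - 1) + 1):
            acc -= den[j] * b[k - j]
        b.append(acc)
    return b


def b_from_formulas(beta):
    """b_0, b_1, b_2 from the closed formulas of Step 1."""
    _, c1, c2 = lemniscate_c(3)
    return [beta,
            beta * (c1 * beta**4 - c1),
            beta * (c2 * beta**8 - 5 * c1**2 * beta**4 + 5 * c1**2 - c2)]


# beta = 3: i^eps = -1, so the numerator is -(u^2 + 6u - 3) = 3 - 6u - u^2
b3 = series_quotient([3, -6, -1], [1, 6, -3], 3)
# beta = 5: i^eps = 1, numerator P_5 and denominator Q_5 written lowest degree first
b5 = series_quotient([5, -62, -105, 300, -125, 50, 1],
                     [1, 50, -125, 300, -105, -62, 5], 3)

print(b3, b_from_formulas(3))
print(b5, b_from_formulas(5))
```

The two lists in each printed line should coincide, and all entries are integers even though $c_1$ and $c_2$ are not.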


Step 2: Prove that $\beta$ divides $b_0(\beta),\dots,b_{d-1}(\beta)$ when $\beta$ is an odd prime. The
equation (15.59) seems to imply that $b_k(\beta)$ is a multiple of $\beta$ for all $k \geq 0$. The
problem is that $S_k(u) \in \mathbb{Q}[u]$ need not have integer coefficients, as shown by $S_1(u)$.
Hence we need to study the denominators of the coefficients of $S_k(u)$.

Let $s_k$ be the least common multiple of these denominators. Then

\[ S_k(u) = \frac{1}{s_k}T_k(u), \]

where $s_k \in \mathbb{Z} \setminus \{0\}$, $T_k(u) \in \mathbb{Z}[u]$, and $\pm 1$ are the only integers dividing $s_k$ and all
coefficients of $T_k(u)$. Eisenstein observed that if $\alpha \in \mathbb{Z}[i]$ is an odd prime, then

\[ \alpha \mid s_k \implies N(\alpha) \leq 4k+1. \tag{15.60} \]

To prove this, first observe that (15.59) implies that

\[ s_kb_k(\beta) = \beta T_k(\beta), \quad \beta \in \mathbb{Z}[i] \text{ odd}. \tag{15.61} \]

We noted above that $b_k(\beta)$ always lies in $\mathbb{Z}[i]$. This means that if an odd Gaussian
prime $\alpha$ divides $s_k$, then $\alpha$ also divides $\beta T_k(\beta)$. It follows that

\[ \beta T_k(\beta) \equiv 0 \bmod \alpha, \quad \beta \in \mathbb{Z}[i] \text{ odd}. \tag{15.62} \]

Then consider the following:

• Since $\alpha$ is odd, the proof of Lemma 15.4.6 shows that elements of $\mathbb{Z}[i]/\alpha\mathbb{Z}[i]$ are
of the form $[\beta]$, $\beta$ odd. Thus (15.62) implies that the reduction of $uT_k(u)$ modulo $\alpha$
is a polynomial with at least $|\mathbb{Z}[i]/\alpha\mathbb{Z}[i]|$ roots.

• Since $\alpha$ divides $s_k$, the definition of $s_k$ shows that the reduction of $uT_k(u)$ modulo
$\alpha$ is a nonzero polynomial of degree at most $4k+1$. Hence the reduction has at
most $4k+1$ roots since $\mathbb{Z}[i]/\alpha\mathbb{Z}[i]$ is a field by Lemma 15.4.2.

These bullets imply that $|\mathbb{Z}[i]/\alpha\mathbb{Z}[i]| \leq 4k+1$. However, $|\mathbb{Z}[i]/\alpha\mathbb{Z}[i]| = N(\alpha)$ by
Lemma 15.4.2. Thus $N(\alpha) \leq 4k+1$, and (15.60) follows.
Now fix an odd Gaussian prime $\beta$. Then (15.60), applied to $\beta$, tells us that

\[ N(\beta) > 4k+1 \implies \beta \nmid s_k. \]

Note that $N(\beta) > 4k+1$ if and only if $k < d = (N(\beta) - 1)/4$. It follows that $\beta \nmid s_k$
for $k = 0,\dots,d-1$. Since $\beta$ is prime, (15.61) implies that $\beta$ divides $b_k(\beta)$ for
$k = 0,\dots,d-1$. This is what we needed to prove.
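
Eisenstein's observation (15.60) can be explored experimentally by computing the first few $S_k(u)$ and their denominators $s_k$. The sketch below is a hedged illustration, not the proof: it assumes the recursion $S_j(u) = c_ju^{4j} - \sum_{k<j} e_{k,j}S_k(u)$, where $e_{k,j}$ denotes the coefficient of $z^{4j+1}$ in $\varphi(z)^{4k+1}$ (this follows from comparing coefficients in (15.57)), and it uses the classification of Gaussian primes from Section A to list the norms of odd Gaussian primes dividing $s_k$. All names are hypothetical.

```python
from fractions import Fraction
from math import gcd


def lemniscate_series(n_terms):
    """Taylor coefficients of phi, generated from phi'' = -2 phi^3 (assumed)."""
    a = [Fraction(0)] * n_terms
    a[1] = Fraction(1)
    for n in range(n_terms - 2):
        cube = sum((a[i] * a[j] * a[n - i - j]
                    for i in range(n + 1) for j in range(n + 1 - i)),
                   Fraction(0))
        a[n + 2] = -2 * cube / ((n + 2) * (n + 1))
    return a


def truncated_power(a, exponent, n_terms):
    """Coefficients of a(z)**exponent, truncated below degree n_terms."""
    result = [Fraction(1)] + [Fraction(0)] * (n_terms - 1)
    for _ in range(exponent):
        nxt = [Fraction(0)] * n_terms
        for i, x in enumerate(result):
            if x:
                for j, y in enumerate(a):
                    if i + j < n_terms:
                        nxt[i + j] += x * y
        result = nxt
    return result


def s_polynomials(kmax):
    """S_0, ..., S_kmax as dicts {degree: coefficient}."""
    n_terms = 4 * kmax + 2
    a = lemniscate_series(n_terms)
    polys = []
    for j in range(kmax + 1):
        poly = {4 * j: a[4 * j + 1]}
        for k in range(j):
            e_kj = truncated_power(a, 4 * k + 1, n_terms)[4 * j + 1]
            for deg, coef in polys[k].items():
                poly[deg] = poly.get(deg, Fraction(0)) - e_kj * coef
        polys.append(poly)
    return polys


def denominator_lcm(poly):
    """Least common multiple s_k of the coefficient denominators."""
    s = 1
    for coef in poly.values():
        s = s * coef.denominator // gcd(s, coef.denominator)
    return s


def odd_gaussian_prime_norms(s):
    """Norms of the odd Gaussian primes dividing the integer s."""
    norms, p, m = set(), 2, abs(s)
    while m > 1:
        if m % p == 0:
            while m % p == 0:
                m //= p
            if p % 4 == 1:
                norms.add(p)        # p = (a+bi)(a-bi), each factor of norm p
            elif p % 4 == 3:
                norms.add(p * p)    # p stays prime in Z[i], norm p^2
        p += 1
    return norms


S = s_polynomials(3)
for k in range(1, 4):
    s_k = denominator_lcm(S[k])
    print(k, s_k, sorted(odd_gaussian_prime_norms(s_k)), 4 * k + 1)
```

For $k = 1$ this recovers $s_1 = 10$ from $S_1(u)$, whose only odd Gaussian prime divisors $2 \pm i$ have norm $5 = 4 \cdot 1 + 1$.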


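Step 3: Relate $a_1(\beta),\dots,a_d(\beta)$ to $b_0(\beta),\dots,b_{d-1}(\beta)$. This is easy, for if we write
(15.53) in the form

\[ i^\varepsilon\big(u^d + a_1(\beta)u^{d-1} + \cdots + a_d(\beta)\big) = \big(1 + a_1(\beta)u + \cdots + a_d(\beta)u^d\big)\Big(\sum_{k=0}^{\infty} b_k(\beta)u^k\Big) \]

and multiply out the right-hand side, then comparing coefficients of the powers of $u$
gives the equations

\[ \begin{aligned} i^\varepsilon a_d(\beta) &= b_0(\beta), \\ i^\varepsilon a_{d-1}(\beta) &= a_1(\beta)b_0(\beta) + b_1(\beta), \\ &\ \,\vdots \\ i^\varepsilon a_1(\beta) &= a_{d-1}(\beta)b_0(\beta) + a_{d-2}(\beta)b_1(\beta) + \cdots + b_{d-1}(\beta). \end{aligned} \]

The $a_j(\beta)$ lie in $\mathbb{Z}[i]$, and $b_0(\beta),\dots,b_{d-1}(\beta)$ are divisible by $\beta$ by Step 2. It follows
that in the above equations, the right-hand side is always divisible by $\beta$. This shows
that $\beta$ divides $a_1(\beta),\dots,a_d(\beta)$, since $i^\varepsilon$ is a unit. Furthermore, we proved earlier that
$b_0(\beta) = \beta$, so that the first equation implies that $a_d(\beta) = i^{-\varepsilon}\beta$. This completes the
proof of part (a). ∎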
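
Part (a) of Theorem 15.4.8 can be confirmed on the examples already computed in this section. The sketch below is only an illustration with hypothetical helper names: Gaussian integers are `(re, im)` pairs, the values of $\varepsilon$ are those stated in Examples 15.4.5 and 15.4.7, and $\beta = 5$ is included as a contrast, since 5 is not prime in $\mathbb{Z}[i]$ and the theorem does not apply to it.

```python
UNIT_POWERS = [(1, 0), (0, 1), (-1, 0), (0, -1)]  # i^0, i^1, i^2, i^3


def gauss_mul(x, y):
    """Product of Gaussian integers given as (re, im) pairs."""
    return (x[0] * y[0] - x[1] * y[1], x[0] * y[1] + x[1] * y[0])


def gauss_divides(beta, gamma):
    """True if beta divides gamma in Z[i] (beta nonzero)."""
    a, b = beta
    x, y = gamma
    n = a * a + b * b
    return (x * a + y * b) % n == 0 and (y * a - x * b) % n == 0


def satisfies_part_a(beta, eps, coeffs):
    """coeffs = [a_1, ..., a_d]; test beta | a_j for all j and a_d = i^(-eps) beta."""
    all_divisible = all(gauss_divides(beta, a) for a in coeffs)
    last_ok = coeffs[-1] == gauss_mul(UNIT_POWERS[(-eps) % 4], beta)
    return all_divisible and last_ok


# beta = 2 + i (prime, eps = 3): P(u) = u + (-1 + 2i)
print(satisfies_part_a((2, 1), 3, [(-1, 2)]))
# beta = 3 (prime, eps = 2): P(u) = u^2 + 6u - 3
print(satisfies_part_a((3, 0), 2, [(6, 0), (-3, 0)]))
# beta = 5 (not prime in Z[i], eps = 0): the coefficient -62 is not divisible by 5
print(satisfies_part_a((5, 0), 0,
                       [(50, 0), (-125, 0), (300, 0), (-105, 0), (-62, 0), (5, 0)]))
```

The first two checks should succeed and the third should fail, which is consistent with the hypothesis that $\beta$ be prime.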


Mathematical Notes 
Here are some further comments about complex multiplication. 


■ Complex Multiplication. In our discussion of elliptic functions in Section 15.3,
we mentioned that the Weierstrass $\wp$-function $\wp(z;\omega_1,\omega_2)$ for periods $\omega_1, \omega_2$ has
an addition law. It follows easily that it also satisfies multiplication formulas for
$n \in \mathbb{Z}$ that generalize Theorem 15.2.5. However, the $\wp$-function rarely has complex
multiplication. More precisely, $\wp(z;\omega_1,\omega_2)$ has complex multiplication by some
$\beta \in \mathbb{C} \setminus \mathbb{Z}$ if and only if $\omega_2/\omega_1$ is a root of a quadratic polynomial with integer
coefficients. This means that $\omega_2/\omega_1$ lies in an imaginary quadratic field, which
is a field of the form $\mathbb{Q}(\sqrt{-m})$ for some $m > 0$ in $\mathbb{Z}$. For example, the periods
$\omega_1 = (1-i)\varpi$, $\omega_2 = (1+i)\varpi$ of Abel's function $\varphi(z)$ have ratio

\[ \frac{\omega_2}{\omega_1} = \frac{(1+i)\varpi}{(1-i)\varpi} = i, \]

which is a root of $x^2 + 1 = 0$. So the associated imaginary quadratic field is $\mathbb{Q}(i)$.
In general, elliptic functions with complex multiplication have a deep relation to
imaginary quadratic fields. This is discussed in books such as [3], [11], [17], and
[20]. This is also related to class field theory, which will be discussed in the
Mathematical Notes to Section 15.5.


Historical Notes 


In addition to the general theory of elliptic functions, Abel also considered the
lemniscatic function $\varphi(z)$ we've been studying. Let $m + \mu i \in \mathbb{Z}[i]$ be odd, and set
$x = \varphi(\delta)$. Then Abel states complex multiplication by $m + \mu i$ as "one has

\[ \varphi(m + \mu i)\delta = x \cdot T, \]

where $T$ is a rational function of $x^4$" [Abel, Vol. I, p. 354]. As an example, he writes
the formula for complex multiplication by $2+i$ as

\[ \varphi(2+i)\delta = x\,\frac{2 - 2x^8 + i(1 - 6x^4 + x^8)}{1 - 2x^4 + 5x^8} = x\,i\,\frac{1 - 2i - x^4}{1 - (1-2i)x^4}. \]

This is remarkably close to what we did in Example 15.4.5.

Eisenstein also has an important role to play in this story since he was the first
to prove Theorems 15.4.4 and 15.4.8. Here is an extract of a letter that he wrote to
Gauss in 1847 [Eisenstein, p. 845]:

If $m = a+bi$ is an odd complex integer, $p$ is its norm and

\[ y = \frac{U}{V} = \frac{A_0x + A_1x^5 + \cdots + A_{(p-1)/4}x^p}{1 + B_1x^4 + \cdots + B_{(p-1)/4}x^{p-1}} \]

is the algebraic integral of the equation

\[ \frac{dy}{\sqrt{1 - y^4}} = m\,\frac{dx}{\sqrt{1 - x^4}}, \]

then I have further shown that for a two-term complex prime number $m$ the
coefficients of the numerator, except for the last which is a complex unit, and the
coefficients of the denominator, except for the first which = 1, are all divisible
by $m$. I conjecture that the theorem is also correct, when $m$ is a one-term prime
number...

Here, a "two-term" odd complex prime is $m = a+bi$ such that $p = a^2 + b^2$ is prime
in $\mathbb{Z}$ with $p \equiv 1 \bmod 4$, and a "one-term" complex prime is a prime in $\mathbb{Z}$ such that
$p \equiv 3 \bmod 4$. In this letter, Eisenstein could prove part (a) of Theorem 15.4.8 only
in the "two-term" case, though later he obtained a general proof. Also, if we think
of $\varphi(z)$ as the inverse function of the elliptic integral $\int (1 - t^4)^{-1/2}\,dt$, then it should
be clear that the displayed equation in Eisenstein's letter refers to the multiplication
formula for $\varphi(mz)$ in terms of $\varphi(z)$.

The clearest statement of Eisenstein's irreducibility criterion appears in a paper
he wrote in 1850 [Eisenstein, p. 542], where we find the following theorem:

If in a polynomial $F(x)$ of $x$ of arbitrary degree whose coefficient of the
highest term is = 1, and all following coefficients are (real, complex) integers,
in which a certain (real resp. complex) prime number $m$ appears, if in addition
the last coefficient is $= \varepsilon m$, where $\varepsilon$ represents a number not divisible by $m$; then
it is impossible to bring $F(x)$ into the form

\[ (x^\mu + a_1x^{\mu-1} + \cdots + a_\mu)(x^\nu + b_1x^{\nu-1} + \cdots + b_\nu), \]

where $\mu$ and $\nu \geq 1$, $\mu + \nu$ = the degree of $F(x)$, and all $a$ and $b$ are (real resp.
complex) integers; and the equation $F(x) = 0$ is accordingly irreducible.

The reason Eisenstein states the theorem for both $\mathbb{Z}$ and $\mathbb{Z}[i]$ is that he probably
discovered it first over $\mathbb{Z}[i]$ in his study of complex multiplication on the lemniscate
and then realized that it also applies over $\mathbb{Z}$.

When we discussed the Schönemann–Eisenstein criterion in the Historical Notes
to Section 4.2, it was easy to explain what led Schönemann to the criterion, namely,
does reducibility modulo $p$ imply reducibility modulo $p^2$. But as we've seen in
this chapter, it was a much richer mathematical context that led Eisenstein to his
discovery. See [4] for more on Eisenstein and his criterion.


Exercises for Section 15.4 


Exercise 1. Prove (15.36). 


Exercise 2. Let $\alpha \in \mathbb{Z}[i]$ be nonzero. The goal of this exercise is to prove part (a) of
Lemma 15.4.2, which asserts that $|\mathbb{Z}[i]/\alpha\mathbb{Z}[i]| = N(\alpha)$. The idea is to forget multiplication
and think of $\mathbb{Z}[i]$ and $\mathbb{Z}[i]/\alpha\mathbb{Z}[i]$ as groups under addition. Let $m$ be the greatest common
divisor of the real and imaginary parts of $\alpha$, so that $\alpha = m(a + bi)$, where $\gcd(a,b) = 1$. Then
pick $c, d \in \mathbb{Z}$ such that $ad - bc = 1$.

(a) Show that the map $\mathbb{Z}[i] \to \mathbb{Z} \oplus \mathbb{Z}$ defined by

$$\mu + \nu i \mapsto \mu(d, -b) + \nu(-c, a) = (\mu d - \nu c,\ -\mu b + \nu a)$$

is a group isomorphism under addition.
(b) Show that the map of part (a) takes $\alpha$ and $i\alpha$ to $(m, 0)$ and $(-m(ac + bd),\ m(a^2 + b^2))$,
respectively. Then use this to show that the map takes $\alpha\mathbb{Z}[i] \subset \mathbb{Z}[i]$ to the subgroup

$$m\mathbb{Z} \oplus m(a^2 + b^2)\mathbb{Z} \subset \mathbb{Z} \oplus \mathbb{Z}.$$

(c) Use part (b) to conclude that $|\mathbb{Z}[i]/\alpha\mathbb{Z}[i]| = N(\alpha)$.
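
The computation in Exercise 2 can be checked numerically before it is proved. The following is a minimal sketch, not part of the original text: Gaussian integers are modeled as pairs `(re, im)`, and the helper names `ext_gcd`, `lattice_data`, `image`, and `check` are hypothetical. It confirms on examples that the map sends $\alpha$ to $(m, 0)$ and $i\alpha$ to $(-m(ac+bd),\ m(a^2+b^2))$, and that the index $m \cdot m(a^2+b^2)$ equals $N(\alpha)$.

```python
from math import gcd

def ext_gcd(a, b):
    """Return (g, x, y) with a*x + b*y == g, where g is +/- gcd(a, b)."""
    if b == 0:
        return a, 1, 0
    g, x, y = ext_gcd(b, a % b)
    return g, y, x - (a // b) * y

def lattice_data(alpha):
    """For nonzero alpha = (re, im), return (m, a, b, c, d) with
    alpha = m*(a + b*i), gcd(a, b) = 1 and a*d - b*c = 1."""
    re, im = alpha
    m = gcd(re, im)                # positive because alpha is nonzero
    a, b = re // m, im // m
    g, x, y = ext_gcd(a, b)        # a*x + b*y = g with g = +/-1
    d, c = g * x, -g * y           # hence a*d - b*c = g*g = 1
    return m, a, b, c, d

def image(z, a, b, c, d):
    """The map mu + nu*i -> (mu*d - nu*c, -mu*b + nu*a) of part (a)."""
    mu, nu = z
    return (mu * d - nu * c, -mu * b + nu * a)

def check(alpha):
    """Return (image of alpha, image of i*alpha, index of the image subgroup)."""
    m, a, b, c, d = lattice_data(alpha)
    re, im = alpha
    img_alpha = image((re, im), a, b, c, d)
    img_i_alpha = image((-im, re), a, b, c, d)   # i*alpha = -im + re*i
    index = m * (m * (a * a + b * b))            # |Z + Z / (mZ + m(a^2+b^2)Z)|
    return img_alpha, img_i_alpha, index

print(check((6, 9)))   # alpha = 6 + 9i = 3(2 + 3i), N(alpha) = 117
```

The sketch only tests examples; the exercise still asks for a proof that the map is an isomorphism and that the image subgroup is as claimed.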

Exercise 3. Prove part (b) of Lemma 15.4.2. 

Exercise 4. Prove (15.38). 

Exercise 5. Derive the two formulas for $\varphi((2+i)z)$ stated in Example 15.4.3.
Exercise 6. Prove the third and fourth lines of (15.42). 

Exercise 7. Supply the details omitted in the proof of Step 1 of Theorem 15.4.4. 


Exercise 8. Consider the finite ring $\mathbb{Z}[i]/2(1+i)\mathbb{Z}[i]$, and let $\beta \in \mathbb{Z}[i]$ be odd.

(a) Prove that $(\mathbb{Z}[i]/2(1+i)\mathbb{Z}[i])^* = \{\pm[1], \pm[i]\}$, and then explain why this implies that
$\beta \equiv i^{\varepsilon} \bmod 2(1+i)$ for some $\varepsilon \in \{0,1,2,3\}$.

(b) Prove that $\varphi(\beta\tfrac{\varpi}{2}) = i^{\varepsilon}$.

Exercise 9. Suppose that we have relatively prime polynomials $P_\beta(u), Q_\beta(u) \in \mathbb{Z}[i][u]$ such
that $Q_\beta(0) = 1$. Prove that $uP_\beta(u^4)$ and $Q_\beta(u^4)$ have no common roots in $\mathbb{C}$.


Exercise 10. Let $w = z + (1+i)\tfrac{\varpi}{2}$. Use (15.48) and $\beta \equiv i^{\varepsilon} \bmod 2(1+i)$ to show that
$$\varphi(\beta z)\varphi(\beta w) = i^{2\varepsilon - 1}.$$


Exercise 11. Let $F$ be a field, and let $A(u), B(u) \in F[u]$ be nonzero relatively prime polynomials
such that

$$\frac{B(1/u)}{A(1/u)} = \frac{A(u)}{B(u)}$$

in $F(u)$. Let $d = \deg(A)$. Prove that $d = \deg(B)$ and that there is a constant $\lambda \in F^*$ such that
$u^d A(1/u) = \lambda B(u)$.


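Exercise 12. Let $\beta \in \mathbb{Z}[i]$ be prime, and let $f = a_0u^d + a_1u^{d-1} + \cdots + a_d \in \mathbb{Z}[i][u]$. Prove
the Schönemann–Eisenstein criterion over $\mathbb{Z}[i]$, which states that if $\beta \nmid a_0$, $\beta \mid a_1, \ldots, \beta \mid a_d$, and
$\beta^2 \nmid a_d$, then $f$ is irreducible over $\mathbb{Q}(i)$.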


Exercise 13. Prove that the coefficients $b_j(\beta)$ defined in (15.54) lie in $\mathbb{Z}[i]$.


Exercise 14. The function $\varphi(z)$ is analytic at $z = 0$ and hence has a power series expansion.
(a) In Exercise 3 of Section 15.2, you used $\varphi'^2(z) = 1 - \varphi^4(z)$ to show that $\varphi''(z) = -2\varphi^3(z)$.
Use these two identities to prove by induction that for every $n \ge 1$, there is a polynomial
$G_n(u) \in \mathbb{Z}[u]$ such that $\varphi^{(n)}(z)$ equals $G_n(\varphi(z))$ if $n$ is even and $G_n(\varphi(z))\,\varphi'(z)$ if $n$ is
odd.
(b) Use part (a) to prove that the coefficients of the power series expansion of $\varphi(z)$ at $z = 0$
lie in $\mathbb{Q}$.
(c) Use part (b) and $\varphi(iz) = i\varphi(z)$ to show that $\varphi(z) = \sum_{j=0}^{\infty} c_j z^{4j+1}$, $c_j \in \mathbb{Q}$.
(d) Show that $c_0 = 1$, $c_1 = -\tfrac{1}{10}$, and $c_2 = \tfrac{1}{120}$.
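
The coefficients in part (d) can be generated mechanically from the identity $\varphi''(z) = -2\varphi^3(z)$ together with $\varphi(0) = 0$ and $\varphi'(0) = 1$. The following is a minimal sketch, not part of the original text; the function name `lemniscate_series` is hypothetical. Comparing coefficients of $z^k$ on both sides gives $(k+2)(k+1)\,a_{k+2} = -2\,[z^k]\varphi^3$, which determines the series one term at a time in exact rational arithmetic.

```python
from fractions import Fraction

def lemniscate_series(order):
    """Return [a_0, ..., a_order], the Taylor coefficients of phi at 0,
    computed from phi'' = -2*phi^3, phi(0) = 0, phi'(0) = 1 (order >= 1)."""
    a = [Fraction(0)] * (order + 1)
    a[1] = Fraction(1)
    for k in range(order - 1):
        # Coefficient of z^k in phi^3; it only involves a_0, ..., a_k.
        cube_k = sum(a[i] * a[j] * a[k - i - j]
                     for i in range(k + 1)
                     for j in range(k + 1 - i))
        a[k + 2] = Fraction(-2) * cube_k / ((k + 2) * (k + 1))
    return a

coeffs = lemniscate_series(9)
# Nonzero coefficients occur only in degrees 1, 5, 9, as part (c) predicts.
print({k: c for k, c in enumerate(coeffs) if c != 0})
```

This is a numerical check of parts (c) and (d), not a substitute for the induction requested in part (a).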
Exercise 15. Show carefully that (15.58) follows from (15.57). 


Exercise 16. Prove that for each integer $k \ge 0$ there exists a polynomial $S_k(u) \in \mathbb{Q}[u]$ of degree
$4k$ such that (15.59) holds for all odd $\beta \in \mathbb{Z}[i]$.


Exercise 17. Let $n \in \mathbb{Z}$ be an odd integer. Prove that $n \equiv (-1)^{(n-1)/2} \bmod 2(1+i)$. This
shows that when $n$ is an odd integer, we have $i^{\varepsilon} = (-1)^{(n-1)/2}$ in the formula for $\varphi(nz)$ given
in Theorem 15.4.4.
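
The congruences in Exercises 8(a) and 17 are easy to test by machine. The following is a minimal sketch, not part of the original text: Gaussian integers are pairs `(re, im)`, and the names `gmul`, `divides`, and `epsilon` are hypothetical. It finds the exponent $\varepsilon$ with $\beta \equiv i^{\varepsilon} \bmod 2(1+i)$ for odd $\beta$, using the fact that $\gamma$ divides $z$ in $\mathbb{Z}[i]$ exactly when $N(\gamma)$ divides both coordinates of $z\overline{\gamma}$.

```python
def gmul(x, y):
    """Product of Gaussian integers given as (re, im) pairs."""
    return (x[0] * y[0] - x[1] * y[1], x[0] * y[1] + x[1] * y[0])

def divides(gamma, z):
    """True if gamma divides z in Z[i] (gamma nonzero)."""
    norm = gamma[0] ** 2 + gamma[1] ** 2
    w = gmul(z, (gamma[0], -gamma[1]))     # z * conjugate(gamma)
    return w[0] % norm == 0 and w[1] % norm == 0

POWERS_OF_I = [(1, 0), (0, 1), (-1, 0), (0, -1)]   # i^0, i^1, i^2, i^3
MODULUS = (2, 2)                                   # 2(1 + i)

def epsilon(beta):
    """Return e in {0, 1, 2, 3} with beta = i^e mod 2(1+i), for odd beta."""
    a, b = beta
    if (a + b) % 2 == 0:
        raise ValueError("beta must be odd (a + b odd)")
    for e, (u, v) in enumerate(POWERS_OF_I):
        if divides(MODULUS, (a - u, b - v)):
            return e
    raise AssertionError("no exponent found; should not happen for odd beta")

print(epsilon((2, 1)))   # exponent for beta = 2 + i
print(epsilon((7, 0)))   # exponent for the odd integer n = 7
```

For an odd integer $n$ the exponent returned is $0$ or $2$, matching $i^{\varepsilon} = (-1)^{(n-1)/2}$.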


15.5 ABEL'S THEOREM 


In this final section of the book, we will prove Abel’s theorem about straightedge- 
and-compass constructions on the lemniscate. The tools used will include Galois 
theory and the theory of complex multiplication developed in Section 15.4. 


A. The Lemniscatic Galois Group. Let $n$ be an odd positive integer and consider
$$L = \mathbb{Q}(i, \varphi(\tfrac{\varpi}{n})).$$
We will see that the Galois group of $\mathbb{Q}(i) \subset L$ involves the group
$$(\mathbb{Z}[i]/n\mathbb{Z}[i])^*$$
of units in $\mathbb{Z}[i]/n\mathbb{Z}[i]$. Since $\mathbb{Z}[i]$ is a PID, a coset $[\alpha]$ lies in $(\mathbb{Z}[i]/n\mathbb{Z}[i])^*$ if and only
if $\alpha$ is relatively prime to $n$ in $\mathbb{Z}[i]$ (see Exercise 1).


Theorem 15.5.1 $\mathbb{Q}(i) \subset L$ is a Galois extension and there is a one-to-one group
homomorphism

$$\mathrm{Gal}(L/\mathbb{Q}(i)) \to (\mathbb{Z}[i]/n\mathbb{Z}[i])^*.$$

In particular, $\mathrm{Gal}(L/\mathbb{Q}(i))$ is Abelian.


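Proof: Let $A_n(u) = uP_n(u^4)$ be the $n$-division polynomial defined in part (d) of
Theorem 15.4.4. The theorem tells us that the roots of $A_n(u)$ are given by

$$\varphi(\alpha\tfrac{\varpi}{n}), \quad \alpha \in \mathbb{Z}[i] \text{ odd}, \tag{15.63}$$

and the proof of Lemma 15.4.6 shows that for each root, the associated $\alpha \in \mathbb{Z}[i]$ is
unique modulo $n\mathbb{Z}[i]$.

Since each $\alpha$ in (15.63) is odd, the complex multiplication formula for $\varphi(\alpha z)$
given by Theorem 15.4.4 shows that $\varphi(\alpha\tfrac{\varpi}{n})$ is a rational function in $\varphi(\tfrac{\varpi}{n})$ with
coefficients in $\mathbb{Q}(i)$. It follows that $A_n(u)$ splits completely in $L = \mathbb{Q}(i, \varphi(\tfrac{\varpi}{n}))$.
Since one of the roots is $\varphi(\tfrac{\varpi}{n})$, it follows immediately that $L$ is the splitting field of
$A_n(u)$ over $\mathbb{Q}(i)$. Thus $\mathbb{Q}(i) \subset L$ is a Galois extension.

Now take $\sigma \in \mathrm{Gal}(L/\mathbb{Q}(i))$. Then $\sigma(\varphi(\tfrac{\varpi}{n}))$ is a root of $A_n(u)$ and hence is one
of the numbers (15.63). Thus there is $\alpha \in \mathbb{Z}[i]$ odd such that

$$\sigma(\varphi(\tfrac{\varpi}{n})) = \varphi(\alpha\tfrac{\varpi}{n}). \tag{15.64}$$

As noted above, $\alpha$ is unique modulo $n\mathbb{Z}[i]$.
In Exercise 2 you will use Theorem 15.4.4 to show that if $\beta \in \mathbb{Z}[i]$ is odd, then

$$\sigma(\varphi(\beta\tfrac{\varpi}{n})) = \varphi(\alpha\beta\tfrac{\varpi}{n}). \tag{15.65}$$

We now prove that $\alpha$ is relatively prime to $n$. Let $m$ be the order of $\sigma$ in $\mathrm{Gal}(L/\mathbb{Q}(i))$,
so that $\sigma^m$ is the identity. Then repeatedly applying (15.65) yields

$$\varphi(\tfrac{\varpi}{n}) = \sigma^m(\varphi(\tfrac{\varpi}{n})) = \varphi(\alpha^m\tfrac{\varpi}{n}).$$

By uniqueness, we conclude that
$$1 \equiv \alpha^m \bmod n.$$
Hence $\alpha$ is relatively prime to $n$ in $\mathbb{Z}[i]$, so that $\sigma \mapsto [\alpha]$ gives a well-defined map

$$\mathrm{Gal}(L/\mathbb{Q}(i)) \to (\mathbb{Z}[i]/n\mathbb{Z}[i])^*. \tag{15.66}$$

If $\sigma$ and $\tau$ map to $\alpha$ and $\beta$, respectively, then (15.65) easily implies that
$\sigma\tau(\varphi(\tfrac{\varpi}{n})) = \varphi(\alpha\beta\tfrac{\varpi}{n})$. Thus $\sigma\tau$ maps to $\alpha\beta$, which shows that the map is a group homomorphism.
Furthermore, if $[\alpha] = [\beta]$ in $(\mathbb{Z}[i]/n\mathbb{Z}[i])^*$, then

$$\alpha = \beta + (a + ib)n,$$

where $a + ib$ is even because $\alpha$, $\beta$, and $n$ are odd. Then Proposition 15.3.1 implies that

$$\sigma(\varphi(\tfrac{\varpi}{n})) = \varphi(\alpha\tfrac{\varpi}{n}) = \varphi(\beta\tfrac{\varpi}{n}) = \tau(\varphi(\tfrac{\varpi}{n})),$$

from which we conclude that the map is one-to-one since $\varphi(\tfrac{\varpi}{n})$ generates $L$ over
$\mathbb{Q}(i)$. This completes the proof. ∎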


Since Abelian groups are solvable, one corollary of Theorem 15.5.1 and Chapter 8 
is that the coordinates of the n-division points of the lemniscate are expressible by 
radicals over Q. (You will prove this assertion carefully in Exercise 3.) 

The homomorphism (15.66) constructed in Theorem 15.5.1 is the lemniscatic 
analog of the homomorphism 


$$\mathrm{Gal}(\mathbb{Q}(\zeta_n)/\mathbb{Q}) \to (\mathbb{Z}/n\mathbb{Z})^*$$


studied in Chapter 9. We will say more about this analogy in the Mathematical Notes 
at the end of the section. 


B. Straightedge-and-Compass Constructions. We now have the tools needed 
to prove Abel’s theorem on the lemniscate. 


Theorem 15.5.2 Let $n$ be a positive integer. Then the following are equivalent:
(a) The $n$-division points of the lemniscate can be constructed using straightedge
and compass.
(b) $\varphi(\tfrac{2\varpi}{n})$ is constructible.
(c) $n$ is an integer of the form
$$n = 2^s p_1 \cdots p_r,$$
where $s \ge 0$ is an integer and $p_1, \ldots, p_r$ are $r \ge 0$ distinct Fermat primes.
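
Condition (c) is easy to test for a given $n$. The following is a minimal sketch, not part of the original text; the function name `satisfies_condition_c` is hypothetical, and the code assumes that the only Fermat primes are the five currently known ones (3, 5, 17, 257, 65537). Whether further Fermat primes exist is an open question, so the test is only reliable for $n$ small enough that no unknown Fermat prime could divide it.

```python
# Assumption: these are the only Fermat primes relevant for the n being tested.
KNOWN_FERMAT_PRIMES = (3, 5, 17, 257, 65537)

def satisfies_condition_c(n):
    """True if n = 2^s * p_1 * ... * p_r with distinct known Fermat primes p_i."""
    if n < 1:
        raise ValueError("n must be a positive integer")
    while n % 2 == 0:            # strip the power of 2
        n //= 2
    for p in KNOWN_FERMAT_PRIMES:
        if n % p == 0:
            n //= p
            if n % p == 0:       # p^2 divides n: not allowed
                return False
    return n == 1                # nothing else may remain

print([n for n in range(1, 21) if satisfies_condition_c(n)])
```

By Theorem 15.5.2, the integers printed are exactly those $n \le 20$ for which the $n$-division points of the lemniscate are constructible.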


Proof: The implication (a) $\Rightarrow$ (b) is easy, since $\varphi(\tfrac{2\varpi}{n})$ is the polar distance of an
$n$-division point. The converse (b) $\Rightarrow$ (a) follows from part (b) of Corollary 15.2.7.

The proof of (c) $\Rightarrow$ (b) will be a nice application of Theorem 15.5.1 together with
some results of Section 15.2. We first observe that by part (c) of Corollary 15.2.7,
$\varphi(\tfrac{2\varpi}{n})$ is constructible provided that

$$\varphi(\tfrac{2\varpi}{2^s}),\ \varphi(\tfrac{2\varpi}{p_1}),\ \ldots,\ \varphi(\tfrac{2\varpi}{p_r}) \text{ are constructible.}$$

Since $\varphi(\tfrac{2\varpi}{2^s})$ is constructible by Proposition 15.2.3, we need only show that $\varphi(\tfrac{2\varpi}{p})$
is constructible when $p$ is a Fermat prime.

By part (a) of Corollary 15.2.7, $\varphi(\tfrac{2\varpi}{p})$ is constructible whenever $\varphi(\tfrac{\varpi}{p})$ is. For
the latter, Theorem 15.5.1 gives the Galois extension $\mathbb{Q}(i) \subset L = \mathbb{Q}(i, \varphi(\tfrac{\varpi}{p}))$ with

$$\mathrm{Gal}(L/\mathbb{Q}(i)) \simeq \text{a subgroup of } (\mathbb{Z}[i]/p\mathbb{Z}[i])^*.$$

In Exercise 4 you will use the methods of Chapter 10 to prove that if

$$|(\mathbb{Z}[i]/p\mathbb{Z}[i])^*| = \text{a power of } 2, \tag{15.67}$$

then $\varphi(\tfrac{\varpi}{p})$ is constructible.

We will show that (15.67) holds whenever $p = 2^{2^m} + 1$ is a Fermat prime. The
case $p = 3$ is easy (see Exercise 5). If $p > 3$, then $m \ge 1$, so that
$$p = 2^{2^m} + 1 = (2^{2^{m-1}} + i)(2^{2^{m-1}} - i) = \beta\overline{\beta},$$
where $\beta, \overline{\beta}$ are nonassociate primes in $\mathbb{Z}[i]$ of norm $p$. In this case, Exercise 6 and
Lemma 15.4.2 give isomorphisms

$$\mathbb{Z}[i]/p\mathbb{Z}[i] = \mathbb{Z}[i]/\beta\overline{\beta}\mathbb{Z}[i] \simeq \mathbb{Z}[i]/\beta\mathbb{Z}[i] \times \mathbb{Z}[i]/\overline{\beta}\mathbb{Z}[i] \simeq \mathbb{F}_p \times \mathbb{F}_p.$$

Thus
$$|(\mathbb{Z}[i]/p\mathbb{Z}[i])^*| = |\mathbb{F}_p^* \times \mathbb{F}_p^*| = (p-1)^2 = 2^{2^{m+1}}.$$
This proves (15.67) for all Fermat primes and completes the proof of (c) $\Rightarrow$ (b).
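
The orders computed above can be confirmed by brute force for small primes. The following is a minimal sketch, not part of the original text; the function name `unit_count` is hypothetical. It represents $\mathbb{Z}[i]/p\mathbb{Z}[i]$ by pairs $(a, b)$ with $0 \le a, b < p$ and counts the elements that have a multiplicative inverse, so it is practical only for small $p$.

```python
def unit_count(p):
    """Number of units in Z[i]/pZ[i], found by searching for inverses."""
    elems = [(a, b) for a in range(p) for b in range(p)]

    def mul(x, y):
        # (a + bi)(c + di) = (ac - bd) + (ad + bc)i, reduced mod p
        return ((x[0] * y[0] - x[1] * y[1]) % p,
                (x[0] * y[1] + x[1] * y[0]) % p)

    return sum(1 for x in elems if any(mul(x, y) == (1, 0) for y in elems))

for p in (3, 5, 7, 17):
    print(p, unit_count(p))
```

For the Fermat primes 3, 5, and 17 the counts are powers of 2, in agreement with (15.67); for $p = 7$ the count is $p^2 - 1 = 48$, which is not.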

It remains to prove (b) $\Rightarrow$ (c). This is where we will use the irreducibility result
proved in Theorem 15.4.8. Let $n$ be an integer such that $\varphi(\tfrac{2\varpi}{n})$ is constructible. We
may assume that $n > 1$ since the theorem is trivially true when $n = 1$. Furthermore,
the doubling formula (15.14) implies that we may assume that $n$ is odd (be sure you
can explain why), and Proposition 15.2.3 shows that $\varphi(\tfrac{\varpi}{n})$ is constructible.

Let $p$ be a prime dividing $n$. Then $p$ is odd because $n$ is. Let $\beta$ be a complex prime
such that $p = \beta$ if $p \equiv 3 \bmod 4$ and $p = \beta\overline{\beta}$ if $p \equiv 1 \bmod 4$. Then $n/\beta \in \mathbb{Z}[i]$ is odd
(since $n$ and $\beta$ are), so that $\tfrac{\varpi}{\beta}$ is an odd multiple of $\tfrac{\varpi}{n}$. This makes it easy to show
that

$$\varphi(\tfrac{\varpi}{\beta}) \in \mathbb{Q}(i, \varphi(\tfrac{\varpi}{n}))$$

(see Exercise 7). It follows that $\varphi(\tfrac{\varpi}{\beta})$ is constructible, since $i$ and $\varphi(\tfrac{\varpi}{n})$ are. By
Corollary 10.1.8 from Chapter 10, the minimal polynomial of $\varphi(\tfrac{\varpi}{\beta})$ over $\mathbb{Q}$ has
degree equal to a power of 2. Then the Tower Theorem shows that the minimal
polynomial of $\varphi(\tfrac{\varpi}{\beta})$ over $\mathbb{Q}(i)$ also has degree equal to a power of 2.

Theorem 15.4.4 implies that $\varphi(\tfrac{\varpi}{\beta})$ is a root of $uP_\beta(u^4)$. It is easy to see that
$\varphi(\tfrac{\varpi}{\beta}) \ne 0$ (see Exercise 8), so that $\varphi(\tfrac{\varpi}{\beta})$ is a root of $P_\beta(u^4)$. Since $\beta$ is an odd
prime, $P_\beta(u^4)$ has degree $N(\beta) - 1$ by Theorem 15.4.4 and is irreducible over $\mathbb{Q}(i)$
by Theorem 15.4.8. This proves that the minimal polynomial of $\varphi(\tfrac{\varpi}{\beta})$ over $\mathbb{Q}(i)$ has
degree $N(\beta) - 1$.

When $p = \beta$ for $p \equiv 3 \bmod 4$, we have $N(\beta) - 1 = p^2 - 1 = (p+1)(p-1)$. One
easily sees that this is a power of 2 if and only if $p = 3$ (see Exercise 9). On the other
hand, when $p = \beta\overline{\beta}$ for $p \equiv 1 \bmod 4$, we have $N(\beta) - 1 = p - 1$, which is a power
of 2 if and only if $p$ is a Fermat prime.

Thus the only primes dividing $n$ are Fermat primes. To complete the proof of the
theorem, we need to show that $p^2 \mid n$ cannot occur. So assume that $p^2 \mid n$, where $p$ is
prime. Then there is an odd complex prime $\beta$ such that $\beta^2 \mid n$. By Exercise 7,

$$u_0 = \varphi(\tfrac{\varpi}{\beta^2}) \in L = \mathbb{Q}(i, \varphi(\tfrac{\varpi}{n})),$$

which implies as above that $u_0$ is constructible. Hence the degree of its minimal
polynomial over $\mathbb{Q}(i)$ is a power of 2. We will prove that the minimal polynomial
has degree $N(\beta)(N(\beta) - 1)$. This is not a power of 2 since $N(\beta) = p$ or $p^2$.

Since $\beta$ is odd, Theorem 15.4.4 implies that
$$\varphi(\tfrac{\varpi}{\beta}) = \varphi(\beta\tfrac{\varpi}{\beta^2}) = i^{\varepsilon} u_0 \frac{P_\beta(u_0^4)}{Q_\beta(u_0^4)}.$$
Since $\varphi(\tfrac{\varpi}{\beta})$ is a root of $P_\beta(u^4)$ and $(i^{\varepsilon})^4 = 1$, this formula for $\varphi(\tfrac{\varpi}{\beta})$ gives the equation

$$P_\beta\!\left(u_0^4 \frac{P_\beta(u_0^4)^4}{Q_\beta(u_0^4)^4}\right) = 0.$$

If we write $P_\beta(u) = u^d + a_1u^{d-1} + \cdots + a_d$, $d = (N(\beta) - 1)/4$, then clearing
denominators in the above equation shows that $u_0$ is a root of

$$P(u) = u^{4d}P_\beta(u^4)^{4d} + a_1u^{4d-4}P_\beta(u^4)^{4d-4}Q_\beta(u^4)^4 + \cdots + a_dQ_\beta(u^4)^{4d}.$$

This has coefficients in $\mathbb{Z}[i]$ and degree $4d(4d+1) = N(\beta)(N(\beta) - 1)$, since $P_\beta(u)$,
$Q_\beta(u) \in \mathbb{Z}[i][u]$ have degree $d$. Furthermore, Theorem 15.4.8 implies that

$$\beta \text{ divides } a_1, \ldots, a_d. \tag{15.68}$$

Thus $P_\beta(u) \equiv u^d \bmod \beta$. Using this and (15.68), we see that
$$P(u) \equiv u^{N(\beta)(N(\beta)-1)} \bmod \beta,$$
since $4d(4d+1) = N(\beta)(N(\beta) - 1)$. Furthermore, $Q_\beta(0) = 1$ by Theorem 15.4.4,
so that the constant term of $P(u)$ is

$$P(0) = 0 + \cdots + 0 + a_dQ_\beta(0)^{4d} = a_d.$$

Theorem 15.4.8 shows that $a_d$ is not divisible by $\beta^2$, so that by the Schönemann–Eisenstein
criterion over $\mathbb{Q}(i)$ (proved in Exercise 12 of Section 15.4), $P(u)$ is
irreducible over $\mathbb{Q}(i)$. Thus the minimal polynomial of $u_0$ over $\mathbb{Q}(i)$ has degree
$N(\beta)(N(\beta) - 1)$. The proof is now complete. ∎


Mathematical Notes 
Here are comments about some ideas related to this section. 
• The Lemniscatic Galois Extension. Let $n \in \mathbb{Z}$ be odd and positive. The field
$$L = \mathbb{Q}(i, \varphi(\tfrac{\varpi}{n}))$$
played an important role in our treatment of the lemniscate. This field has a nice
relation to the elliptic curve $y^2 = 4x^3 + x$ discussed in the Mathematical Notes to
Section 15.3. To explain this, first note the surprising fact that $\varphi'(\tfrac{\varpi}{n}) \in L$. You will
prove this in Exercise 10. This means that

$$L = \mathbb{Q}(i, \varphi(\tfrac{\varpi}{n}), \varphi'(\tfrac{\varpi}{n})). \tag{15.69}$$

Then, using the formulas (15.32) and (15.33), one can show that $L$ is the extension
of $\mathbb{Q}(i)$ generated by the $x$- and $y$-coordinates of the $(1+i)n$-torsion points on the
elliptic curve $y^2 = 4x^3 + x$.

In general, extensions generated by torsion points of elliptic curves are an impor- 
tant topic in number theory. See [18] or [20] for a nice introduction. 


• Abelian Extensions of $\mathbb{Q}(i)$. In Theorem 15.5.1, we constructed a one-to-one
group homomorphism

$$\mathrm{Gal}(L/\mathbb{Q}(i)) \to (\mathbb{Z}[i]/n\mathbb{Z}[i])^*$$

when $n$ is odd and positive. Thus $\mathbb{Q}(i) \subset L$ is an Abelian extension. A remarkable
fact is that as $n$ ranges over all positive integers, the fields $L$ defined in (15.69) contain
all Abelian extensions of $\mathbb{Q}(i)$, in the sense that if $\mathbb{Q}(i) \subset K$ is a Galois extension
with Abelian Galois group, then there is an integer $n > 0$ such that

$$\mathbb{Q}(i) \subset K \subset L = \mathbb{Q}(i, \varphi(\tfrac{\varpi}{n}), \varphi'(\tfrac{\varpi}{n})).$$


The proof of this result uses class field theory and complex multiplication. See, for 
example, [17, Ch. II, Example 5.8]. 


• Class Field Theory. A number field $K$ is a finite extension of $\mathbb{Q}$. The main goal
of class field theory is to describe all Abelian extensions of $K$. For example, when
$K = \mathbb{Q}$, the Kronecker–Weber Theorem from the Historical Notes to Section 6.5
states that every Abelian extension of $\mathbb{Q}$ is a subfield of the cyclotomic extension
$\mathbb{Q}(\zeta_n)$ for some $n$. Similarly, we noted above that every Abelian extension of $\mathbb{Q}(i)$ is
a subfield of the lemniscatic extension (15.69) for some $n$.

The general version of class field theory describes Abelian extensions of a number 
field K, though the description uses the language of algebraic number theory and is 
not as explicit as for K = Q or Q(i). See [3, §8], [11, Sec. 8.4], or [17, §II.3] for a 
brief review of class field theory. In the special case of an imaginary quadratic field 
K, the theory of complex multiplication uses certain elliptic curves to give an explicit 
description of the Abelian extensions of K and their Galois groups. This is described 
in [11, Ch. 10] and [17, Ch. II].

For example, the theory of ray class fields implies that if n is odd, then 


Qi) CL’ =Q(i,¢ (F)) 
is a Galois extension with Galois group 
(15.70) Gal(L'/Q(i)) = (Zfi]/(1 + nZ[i))*/{+[1],+[}. 


Using (15.70), one can get a shorter proof of Theorem 15.5.2 that doesn’t require
the hard work of Theorems 15.4.4 and 15.4.8. This is closely related to the elegant
treatment of Abel’s theorem given in [15].

The theory of elliptic curves is an important and beautiful area of number theory.
There are also many unsolved problems of great interest. But only certain
elliptic curves, those with complex multiplication, have the special link to Abelian
extensions of quadratic number fields.


• Origami and the Lemniscate. Besides straightedge-and-compass constructions
on the lemniscate, one can also use origami from Section 10.3 to divide the lemniscate
into arcs of equal length. As explained in [5], the answer is almost the same as for
the circle. The proof uses the class field theory developed in [15].


Historical Notes 


The story of this section begins with Article 335 of Disquisitiones [6], where 
Gauss introduces his theory of geometric constructions and cyclotomic fields. He 
then goes on to say 

The principles of the theory that we are going to explain actually extend much 
farther than we will indicate. For they can be applied not only to circular functions 
but just as well to other transcendental functions, e.g. to those that depend on 
the integral $\int [1/\sqrt{1 - x^4}]\,dx$ and also to various types of congruences. Since,
however, we are preparing a large work on those transcendental functions ..., 
we have decided to consider only circular functions here. 


Gauss’s “large work” never appeared, though the reference to the lemniscate would 
have been unmistakable to any nineteenth-century reader. 

Abel was clearly intrigued by Gauss’s remark. He read Disquisitiones carefully 
and understood Gauss’s method for solving cyclotomic equations by radicals. He 
also defined a version of the function $\varphi(z)$ for the integral

$$\int \frac{dt}{\sqrt{(1 - c^2t^2)(1 + e^2t^2)}}$$


and gave formulas for multiplication by n. The resulting n-division polynomials lead 
to certain algebraic equations, and one of Abel’s goals in Recherches sur les fonctions 
elliptiques [Abel, Vol. I, pp. 263-388] is to determine whether these equations are 
solvable by radicals. Abel notes on page 352 that the n-division polynomial, 


taken in its full generality, is probably not solvable algebraically for arbitrary 


values of e and c, but nevertheless, there are particular cases when one can solve 

it completely . .. 
For Abel, the case of greatest interest was the lemniscate given by $e = c = 1$, though
he also knew that $e = c\sqrt{3}$ and $e = c(2 \pm \sqrt{3})$ give polynomials that are solvable
by radicals. From the modern point of view, these “particular cases” correspond to
elliptic curves with complex multiplication.

As an example of Abel’s methods, let $\varphi(z)$ be the lemniscatic function and fix a
prime $p \equiv 1 \bmod 4$. Write $p = 4\nu + 1 = \alpha^2 + \beta^2$, $\alpha, \beta \in \mathbb{Z}$. Then, on pages 357 and
358, Abel asserts that

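one has an equation
$$R = 0$$
of degree $\frac{\alpha^2 + \beta^2 - 1}{2} = 2\nu$, whose roots are

$$\varphi^2(\delta),\ \varphi^2(2\delta),\ \varphi^2(3\delta),\ \ldots,\ \varphi^2(2\nu\delta),$$

where for brevity one supposes $\delta = \frac{\omega}{\alpha + \beta i}$.

Given this, one can easily solve the equation $R = 0$, by aid of the method of
M. Gauss.

Letting $\varepsilon$ be a primitive root of $\alpha^2 + \beta^2$, I say that one can express the roots
as follows:

$$\varphi^2(\delta),\ \varphi^2(\varepsilon\delta),\ \varphi^2(\varepsilon^2\delta),\ \varphi^2(\varepsilon^3\delta),\ \ldots,\ \varphi^2(\varepsilon^{2\nu-1}\delta).$$

Here, $\omega$ is what we call $\varpi$, and $\varepsilon$ is an integer whose congruence class modulo
$p = \alpha^2 + \beta^2$ generates the multiplicative group $\mathbb{F}_p^* \simeq (\mathbb{Z}[i]/(\alpha + \beta i)\mathbb{Z}[i])^*$.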

This quotation shows Abel using $\alpha + \beta i$ to study the $p$-division points on the
lemniscate. Furthermore, the roots listed above have an important structure. We may
assume that $\varepsilon$ is odd, so that the multiplication formula for $\varphi(\varepsilon z)$ easily implies that

$$\varphi^2(\varepsilon z) = \theta(\varphi^2(z))$$

for some rational function $\theta(u)$ with coefficients in $\mathbb{Z}$. It follows that if we let
$x_0 = \varphi^2(\delta)$, then the roots of $R = 0$ can be written

$$x_0,\ \theta(x_0),\ \theta^2(x_0),\ \theta^3(x_0),\ \ldots,\ \theta^{2\nu-1}(x_0),$$

where the exponents refer to composition, i.e., $\theta^2(x_0) = \theta(\theta(x_0))$, etc. Compare this
with Abel’s 1829 paper Mémoire sur une classe particulière d’équations résolubles
algébriquement, where he says that radical solutions exist

if all of the roots of an equation can be expressed by
$$x,\ \theta x,\ \theta^2x,\ \theta^3x,\ \ldots,\ \theta^{n-1}x, \quad \text{where } \theta^nx = x,$$
$\theta x$ being a rational function of $x$, and $\theta^2x$, $\theta^3x$, … the functions of the same form
as $\theta x$, taken two times, three times, etc.


(See [Abel, Vol. I, pp. 478-479].) Abel proves that any equation whose roots satisfy 
this condition is solvable by radicals. These quotations show that Abel’s condition 
arises naturally from his work on the lemniscate. 

We saw in Section 6.5 that Abel considered a more general class of equations
in his “classe particulière” paper. Rather than assume as above that all of the roots
are generated by iterating a single rational function, suppose instead that $f(x) = 0$,
$f \in F[x]$, has a root $x_0$ with the property that any other root is of the form $\theta_i(x_0)$ for
some rational function $\theta_i \in F(u)$. If we further assume that

$$\theta_i(\theta_j(x_0)) = \theta_j(\theta_i(x_0))$$

for all $i$ and $j$, then Theorem 6.5.3 implies that the Galois group of the splitting field
of $f$ over $F$ is Abelian, and hence $f$ is solvable by radicals over $F$ by Chapter 8.

The Historical Notes to Section 6.5 describe how Abel’s equations led to the 
modern Abelian group via the nineteenth century Abelian equation. But in Chapter 6, 
we didn’t know what led Abel to these particular equations. Now we do—it was his 
work on the lemniscate! Thus the term “Abelian group,” known to every beginning 
algebra student, has an unexpectedly rich history.

Kronecker was the first to realize the full power of the equations described by
Abel. In the 1853 paper where he introduced the term “Abelian equation,” Kronecker
conjectured that all Abelian extensions of $\mathbb{Q}$ are contained in cyclotomic extensions
(this is the Kronecker–Weber Theorem from Section 6.5), and he also asserts the
following [Kronecker, Vol. IV, p. 11]:

There also exists a close relation between the roots of Abelian equations whose
coefficients are complex integers of the form $a + b\sqrt{-1}$ and the roots of equations
arising from the division of the lemniscate ...

Kronecker speculated that similar results might hold over imaginary quadratic fields.
He called this “mein liebsten Jugendtraum” (“the dearest dream of my youth”) in a
letter written to Dedekind in 1880 [Kronecker, Vol. V, p. 455]. The first complete
proofs of the theorems of class field theory and complex multiplication were given
by Takagi and Fueter in the 1920s. A nice discussion of Gauss, Abel, Eisenstein, and
Kronecker appears in [19, Ch. 3 and 4]. See also [12, Sec. 8.6].


Exercises for Section 15.5 
Exercise 1. Let $\beta \in \mathbb{Z}[i]$ be nonzero. Then $\alpha \in \mathbb{Z}[i]$ gives $[\alpha] \in \mathbb{Z}[i]/\beta\mathbb{Z}[i]$. Prove that
$[\alpha] \in (\mathbb{Z}[i]/\beta\mathbb{Z}[i])^*$ if and only if $\alpha$ is relatively prime to $\beta$.

Exercise 2. As in the proof of Theorem 15.5.1, let $u_0 = \varphi(\tfrac{\varpi}{n})$, and assume that
$\sigma \in \mathrm{Gal}(L/\mathbb{Q}(i))$ satisfies $\sigma(u_0) = \varphi(\alpha\tfrac{\varpi}{n})$, where $\alpha \in \mathbb{Z}[i]$ is odd. Use the multiplication formula
for $\beta \in \mathbb{Z}[i]$ odd to prove (15.65).

Exercise 3. Use Theorem 15.5.1 and Chapter 8 to prove that the $x$- and $y$-coordinates of the
$n$-division points of the lemniscate are expressible by radicals over $\mathbb{Q}$.

Exercise 4. Give a careful proof that (15.67) implies that $\varphi(\tfrac{\varpi}{p})$ is constructible.

Exercise 5. Prove that $|(\mathbb{Z}[i]/3\mathbb{Z}[i])^*| = 8$.

Exercise 6. Let $\alpha, \beta \in \mathbb{Z}[i]$ be nonzero and relatively prime. Prove the Chinese Remainder
Theorem for $\mathbb{Z}[i]$, which asserts that there is a ring isomorphism

$$\mathbb{Z}[i]/\alpha\beta\mathbb{Z}[i] \simeq \mathbb{Z}[i]/\alpha\mathbb{Z}[i] \times \mathbb{Z}[i]/\beta\mathbb{Z}[i].$$

Exercise 7. When evaluating the multiplication formula for $\varphi(\alpha z)$ at a complex number $z_0$,
one needs to worry about poles and vanishing denominators.
(a) Let $\alpha \in \mathbb{Z}[i]$ be odd, and assume that $z_0$ is a pole of neither $\varphi(z)$ nor $\varphi(\alpha z)$. Prove
carefully that $Q_\alpha(\varphi^4(z_0)) \ne 0$ and that

$$\varphi(\alpha z_0) = i^{\varepsilon}\varphi(z_0)\frac{P_\alpha(\varphi^4(z_0))}{Q_\alpha(\varphi^4(z_0))}.$$

Exercise 9 of Section 15.4 will be useful.
(b) Let $n$ be odd, and let $p$ be a prime dividing $n$. Then let $\beta$ be a Gaussian prime such that
$p = \beta$ if $p \equiv 3 \bmod 4$ and $p = \beta\overline{\beta}$ if $p \equiv 1 \bmod 4$. Use part (a) to prove carefully that

$$\varphi(\tfrac{\varpi}{\beta}) \in \mathbb{Q}(i, \varphi(\tfrac{\varpi}{n})).$$

Theorem 15.3.2 will be helpful.
(c) Let $n$ be odd, and let $p$ be a prime such that $p^2$ divides $n$. Also define $\beta$ as in part (b).
Prove that

$$\varphi(\tfrac{\varpi}{\beta^2}) \in \mathbb{Q}(i, \varphi(\tfrac{\varpi}{n})).$$

Exercise 8. Let $\beta \in \mathbb{Z}[i]$ be an odd prime. Prove that $\varphi(\tfrac{\varpi}{\beta}) \ne 0$.

Exercise 9. Let $p \in \mathbb{Z}$ be prime. Prove that $p^2 - 1$ is a power of 2 if and only if $p = 3$.

Exercise 10. Let $n \in \mathbb{Z}$ be odd and positive, and let $L = \mathbb{Q}(i, \varphi(\tfrac{\varpi}{n}))$. Use (15.9) and the
multiplication law for $\varphi((n-1)z)$ to prove that $\varphi'(\tfrac{\varpi}{n}) \in L$.


APPENDIX A
ABSTRACT ALGEBRA 


This appendix summarizes most of the abstract algebra needed for Galois theory. 
Section A.1 reviews basic material on groups, rings, fields, and polynomials. Most 
of this material should be familiar, though it might be a good idea to review the 
notation. Before beginning Chapter 1, readers should also review complex numbers 
and the nth roots of unity from Section A.2. 

The other sections cover a variety of topics. Section A.3 discusses polynomials 
with coefficients in Q. Section A.4 deals with group actions, which are used in several 
places in the text. Section A.5 includes the Sylow theorems, the Chinese Remainder 
Theorem, the multiplicative group of a field, and unique factorization domains. 


A.1_ BASIC ALGEBRA 


We recall some basic material from abstract algebra. 


A. Groups. We assume that the reader is familiar with groups and subgroups. We 
usually write the group operation in a group G as gh for g,h € G, and the identity 
element is denoted e. If G is finite, then |G| is called the order of G.


Galois Theory, Second Edition. By David A. Cox 515 
Copyright © 2012 John Wiley & Sons, Inc. 




If $g \in G$, then the order of $g$, denoted $o(g)$, is the smallest positive integer $n$ such
that $g^n = e$, if it exists. If $g^n \ne e$ for all positive integers $n$, then $o(g) = \infty$.
Given a subgroup $H$ of a group $G$, the left coset determined by $g \in G$ is

$$gH = \{gh \mid h \in H\}$$

and the right coset determined by $g$ is

$$Hg = \{hg \mid h \in H\}.$$

Two left cosets $g_1H$ and $g_2H$ are equal if and only if $g_1^{-1}g_2 \in H$. Similarly,
$Hg_1 = Hg_2$ if and only if $g_2g_1^{-1} \in H$.

The left cosets $gH$ of $H \subset G$ partition $G$ into disjoint subsets. Furthermore, if $H$
is finite, then each left coset has the same number of elements as $H$, i.e., $|gH| = |H|$
for all $g \in G$. If $G$ is also finite, then the number of cosets is finite. This leads to
Lagrange’s Theorem, which is stated as follows.


Theorem A.1.1 If $H$ is a subgroup of a finite group $G$, then $|H|$ divides $|G|$. ∎


The quotient |G|/|H| equals the number of left cosets. This number is the index 
of H in G and is denoted [G: H]. We discuss Lagrange’s version of Theorem A.1.1 
in Chapter 12. 

The above statements also apply to right cosets. In general, the partition of G 
into right cosets can differ from its partition into left cosets. Galois was the first 
to recognize the importance of when these partitions agree. This happens when the 
subgroup H is normal. As is well known, 


$$H \text{ is normal in } G \iff gH = Hg \text{ for all } g \in G \iff gHg^{-1} = H \text{ for all } g \in G.$$

When $H \subset G$ is normal, the left (= right) cosets form a group under the operation
$g_1H \cdot g_2H = g_1g_2H$. This is called the quotient group and is denoted $G/H$. The
identity element of $G/H$ is the coset $eH = H$.


Example A.1.2 The integers modulo $n$ under addition form the quotient group $\mathbb{Z}/n\mathbb{Z}$.
Elements of $\mathbb{Z}/n\mathbb{Z}$ are sometimes called congruence classes. The congruence class
of $i \in \mathbb{Z}$ is denoted $[i] \in \mathbb{Z}/n\mathbb{Z}$. ◁▷


We also assume that the reader knows the definition of group homomorphism
$\varphi : G_1 \to G_2$. Given such a $\varphi$, its kernel is

$$\mathrm{Ker}(\varphi) = \{g \in G_1 \mid \varphi(g) = e_2\},$$

where $e_2$ is the identity of $G_2$, and its image is

$$\mathrm{Im}(\varphi) = \{\varphi(g) \mid g \in G_1\}.$$

Then $\mathrm{Ker}(\varphi)$ is a normal subgroup of $G_1$ and $\mathrm{Im}(\varphi)$ is a subgroup of $G_2$.

If a group homomorphism $\varphi : G_1 \to G_2$ is one-to-one and onto, then the inverse
function $\varphi^{-1} : G_2 \to G_1$ is also a group homomorphism. Thus $\varphi$ is a group isomorphism.
In this situation, we often write $\varphi : G_1 \simeq G_2$.

Given a group homomorphism $\varphi : G_1 \to G_2$, the Fundamental Theorem of Group
Homomorphisms relates $\mathrm{Ker}(\varphi)$ and $\mathrm{Im}(\varphi)$ as follows.


Theorem A.1.3 Let $\varphi : G_1 \to G_2$ be a group homomorphism. Then there is a unique
group isomorphism $\overline{\varphi} : G_1/\mathrm{Ker}(\varphi) \simeq \mathrm{Im}(\varphi)$ such that $\overline{\varphi}(g\,\mathrm{Ker}(\varphi)) = \varphi(g)$ for all
$g \in G_1$. ∎


A group $G$ is cyclic if there is $g \in G$ such that $G = \{g^{\ell} \mid \ell \in \mathbb{Z}\}$. When $G$ is cyclic,
recall that

$$G \simeq \begin{cases} \mathbb{Z}, & \text{if } G \text{ is infinite}, \\ \mathbb{Z}/n\mathbb{Z}, & \text{if } |G| = n < \infty. \end{cases}$$


We have the following result about the subgroups of a cyclic group. 


Theorem A.1.4 Let $G$ be a cyclic group. Then:
(a) Every subgroup of $G$ is cyclic.
(b) If $|G| = n < \infty$, then for every positive divisor $d$ of $n$, $G$ has a unique subgroup
of order $d$. ∎


One way to create cyclic groups is to pick $g \in G$ and consider the subgroup
generated by $g$, namely

$$\langle g \rangle = \{g^{\ell} \mid \ell \in \mathbb{Z}\}.$$

If $g$ has finite order $o(g) < \infty$, then $\langle g \rangle$ is a cyclic group of order $o(g)$. More generally,
$\langle S \rangle \subset G$ denotes the subgroup generated by a subset $S \subset G$.

If $G$ is a finite group, then applying Lagrange’s Theorem to $\langle g \rangle \subset G$ shows that
$o(g)$ divides $|G|$. A partial converse is the following classic theorem of Cauchy.


Theorem A.1.5 If a prime $p$ divides the order $|G|$ of a finite group $G$, then $G$ has an
element of order $p$. ∎


For us, one of the most important groups is the symmetric group $S_n$. This is
the group of permutations of $n$ objects, usually thought of as elements of the set
$\{1, \ldots, n\}$. Thus $S_n$ is the set of functions

$$S_n = \{\sigma : \{1, \ldots, n\} \to \{1, \ldots, n\} \mid \sigma \text{ is one-to-one and onto}\},$$

where the group operation is given by composition of functions, and the identity
element is the identity function $e(i) = i$ for $1 \le i \le n$.
If $\sigma \in S_n$ is given by $\sigma(j) = i_j$ for $j = 1, \ldots, n$, then following Cauchy, we write
$\sigma$ in the form

$$\sigma = \begin{pmatrix} 1 & 2 & \cdots & n \\ i_1 & i_2 & \cdots & i_n \end{pmatrix}.$$

Also recall cycle notation. Given distinct numbers $i_1, \ldots, i_{\ell} \in \{1, \ldots, n\}$ with $\ell \ge 2$,
the $\ell$-cycle $\sigma = (i_1\, i_2 \cdots i_{\ell}) \in S_n$ is the permutation defined by

$$\begin{aligned} \sigma(i_1) &= i_2, \\ \sigma(i_2) &= i_3, \\ &\ \ \vdots \\ \sigma(i_{\ell-1}) &= i_{\ell}, \\ \sigma(i_{\ell}) &= i_1, \\ \sigma(i) &= i, \quad i \notin \{i_1, \ldots, i_{\ell}\}. \end{aligned} \tag{A.1}$$

Note also that
$$(i_1\, i_2 \cdots i_{\ell}) = (i_2 \cdots i_{\ell}\, i_1) = (i_3 \cdots i_{\ell}\, i_1\, i_2) = \cdots = (i_{\ell}\, i_1 \cdots i_{\ell-1}).$$

As usual, a 2-cycle is called a transposition. Every element of $S_n$ can be written
uniquely as a product of disjoint cycles.

When multiplying cycles, it is important to remember that the operation is com- 
position of functions. For example, consider 


(345)(123)(12) = (1453). 


When we apply the left-hand side, we first operate using (12), then using (123), and 
finally using (345). So we move right to left through the cycles, while inside an 
individual cycle, we move in the opposite direction (e.g., (345) takes 4 to 5). Note 
that some books use different conventions for multiplying cycles. 
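
The right-to-left convention can be checked mechanically. The following is a minimal sketch, not part of the original text; permutations are modeled as Python dictionaries and the helper names `cycle` and `compose` are hypothetical. It multiplies cycles as compositions of functions, applying the rightmost factor first, and reproduces the product $(345)(123)(12) = (1453)$ computed above.

```python
def cycle(*items):
    """The cycle (items[0] items[1] ...) as a dict; unlisted points are fixed."""
    return {x: items[(k + 1) % len(items)] for k, x in enumerate(items)}

def compose(*perms, n):
    """Product of permutations of {1, ..., n}, applied right to left."""
    result = {}
    for x in range(1, n + 1):
        y = x
        for perm in reversed(perms):   # rightmost factor acts first
            y = perm.get(y, y)         # points not in the cycle are fixed
        result[x] = y
    return result

product = compose(cycle(3, 4, 5), cycle(1, 2, 3), cycle(1, 2), n=5)
print(product)                                      # images of 1, ..., 5
print(product == compose(cycle(1, 4, 5, 3), n=5))   # agrees with (1453)
```

A book that multiplies cycles left to right would instead iterate over `perms` in their given order.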

Also recall the identity 


$$(i_1\, i_2 \cdots i_{\ell}) = (i_1\, i_{\ell})(i_1\, i_{\ell-1}) \cdots (i_1\, i_3)(i_1\, i_2), \tag{A.2}$$


which expresses an /-cycle as a product of / — 1 transpositions. 

A permutation $\sigma \in S_n$ is even if it is a product of an even number of transpositions,
and odd otherwise. It follows from (A.2) that an $\ell$-cycle is even when $\ell$ is odd, and is
odd when $\ell$ is even. The sign of $\sigma$ is defined by

$$\mathrm{sgn}(\sigma) = \begin{cases} +1, & \text{if } \sigma \text{ is even}, \\ -1, & \text{if } \sigma \text{ is odd}. \end{cases} \tag{A.3}$$

Note that $\mathrm{sgn} : S_n \to \{\pm 1\}$ is a group homomorphism.
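
The sign can be computed from the disjoint cycle decomposition: by (A.2) an $\ell$-cycle is a product of $\ell - 1$ transpositions, so a permutation of $\{1, \ldots, n\}$ with $c$ disjoint cycles (counting fixed points) is a product of $n - c$ transpositions. The following is a minimal sketch of that computation, not part of the original text; permutations are tuples listing the images of $1, \ldots, n$, and the names `sign`, `compose`, and `even_permutations` are hypothetical.

```python
from itertools import permutations

def sign(perm):
    """Sign of perm, where perm[k] is the image of k + 1."""
    n = len(perm)
    seen = [False] * n
    cycles = 0
    for start in range(n):
        if not seen[start]:
            cycles += 1                 # found a new cycle (or fixed point)
            k = start
            while not seen[k]:
                seen[k] = True
                k = perm[k] - 1
    return 1 if (n - cycles) % 2 == 0 else -1

def compose(s, t):
    """The product st, i.e. k -> s(t(k)): apply t first."""
    return tuple(s[t[k] - 1] for k in range(len(t)))

def even_permutations(n):
    """All permutations of {1, ..., n} with sign +1."""
    return [p for p in permutations(range(1, n + 1)) if sign(p) == 1]

print(even_permutations(3))   # identity and the two 3-cycles
```

Checking `sign(compose(s, t)) == sign(s) * sign(t)` over all pairs in a small symmetric group illustrates the homomorphism property, though it does not prove it.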

The most important subgroup of $S_n$ is the alternating group $A_n$, which is the
subgroup consisting of all even permutations. It is a normal subgroup of $S_n$ of
index 2. This follows from $A_n = \mathrm{Ker}(\mathrm{sgn})$.


Example A.1.6 Note that

$$S_3 = \{e, (12), (13), (23), (123), (132)\},$$
$$A_3 = \{e, (123), (132)\} = \langle (123) \rangle.$$

Furthermore, one can show that

$$S_3,\ A_3,\ \langle (12) \rangle,\ \langle (13) \rangle,\ \langle (23) \rangle,\ \{e\}$$

are all subgroups of $S_3$. ◁▷


A group G is Abelian if the group operation is commutative, i.e., if gh = hg for 
all g,h € G. Recall that every subgroup of an Abelian group is normal. The reason 
for the name “Abelian” involves the Galois theory of an interesting class of equations 
studied by Abel. This is explained in Sections 6.5, 8.5, and 15.5. 

Given groups G and H, their direct product, or more briefly product, is the 
set G x H = {(g,h) | g € G,h € H} with group operation (g,h)(g’,h’) = (gg’,hh’). 
Products enable us to create new groups from old ones and are used in structure 
theorems, such as Theorem A.1.7 below. The Mathematical Notes to Section 6.4 
introduce a generalization of the direct product called the semidirect product. 

Most courses in abstract algebra prove the following structure theorem for finite 
Abelian groups. 


Theorem A.1.7 Every finite Abelian group is isomorphic to a product of cyclic
groups of prime power order. ∎


Another important group is the dihedral group $D_{2n}$ of order $2n$. This group is
generated by elements $g$ of order $n$ and $h$ of order 2 such that $hgh^{-1} = g^{-1}$. Some
books write $D_{2n}$ as $D_n$. For us, the subscript is the order of the group.


B. Rings. The reader should also be familiar with rings and ideals from abstract 
algebra. For us, all rings are commutative and have a multiplicative identity. We 
write the additive identity of a ring R as 0 and the multiplicative identity as 1. 

Since $R$ is commutative, a subset $I \subset R$ is an ideal if and only if $I$ is a subgroup
under addition and $ra \in I$ whenever $r \in R$ and $a \in I$.

An ideal $I$ is principal if there is $r \in R$ such that $I = \{rs \mid s \in R\}$. We say that
$r$ generates $I$. Principal ideals of $R$ are denoted either $rR$ or $(r)$. More generally,
$(r_1, \dots, r_n) = \{\sum_{i=1}^{n} s_i r_i \mid s_1, \dots, s_n \in R\}$ is the ideal generated by $r_1, \dots, r_n \in R$.

The cosets of an ideal $I$ in $R$ are sets of the form $r + I = \{r + s \mid s \in I\}$ for $r \in R$.
Two cosets $r + I$ and $s + I$ are equal if and only if $r - s \in I$. The set of all cosets is
denoted $R/I$ and is a ring under the operations


\[
\begin{aligned}
(r + I) + (s + I) &= (r + s) + I, \\
(r + I) \cdot (s + I) &= rs + I.
\end{aligned}
\]

We call $R/I$ a quotient ring. Since $R$ is commutative with a multiplicative identity,
the same is true for $R/I$. The additive and multiplicative identities of $R/I$ are $0 + I = I$
and $1 + I$, respectively.


Example A.1.8 Every ideal of $\mathbb{Z}$ is principal, so that the ideals of $\mathbb{Z}$ are $n\mathbb{Z}$ for integers
$n \geq 0$. Integers modulo $n$ under addition and multiplication form the quotient ring
$\mathbb{Z}/n\mathbb{Z}$, where the congruence class $[i]$ is the coset $i + n\mathbb{Z}$. <>

For us, a ring homomorphism $\varphi : R \to S$ is a function satisfying the usual conditions
$\varphi(r+s) = \varphi(r) + \varphi(s)$ and $\varphi(rs) = \varphi(r)\varphi(s)$ for all $r, s \in R$. In this book, all
ring homomorphisms preserve the multiplicative identity, unless explicitly stated
otherwise. This means that $\varphi(1_R) = 1_S$, where $1_R$ and $1_S$ are the multiplicative
identities of $R$ and $S$, respectively.

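Given such a $\varphi$, its kernel is

\[
\operatorname{Ker}(\varphi) = \{r \in R \mid \varphi(r) = 0\},
\]

and its image is

\[
\operatorname{Im}(\varphi) = \{\varphi(r) \mid r \in R\}.
\]

Then $\operatorname{Ker}(\varphi)$ is an ideal of $R$ and $\operatorname{Im}(\varphi)$ is a subring of $S$.

If a ring homomorphism $\varphi : R \to S$ is one-to-one and onto, then the inverse function
$\varphi^{-1} : S \to R$ is also a ring homomorphism. Thus $\varphi$ is a ring isomorphism. In this
situation, we often write $\varphi : R \simeq S$.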

Given a ring homomorphism $\varphi : R \to S$, the Fundamental Theorem of Ring
Homomorphisms is as follows.


Theorem A.1.9 Let $\varphi : R \to S$ be a ring homomorphism. Then there is a unique
ring homomorphism $\bar{\varphi} : R/\operatorname{Ker}(\varphi) \simeq \operatorname{Im}(\varphi)$ such that $\bar{\varphi}(r + \operatorname{Ker}(\varphi)) = \varphi(r)$ for all
$r \in R$.


An integral domain is a ring $R$ such that $rs = 0$, $r, s \in R$, implies that $r = 0$ or
$s = 0$. Section A.5 will discuss a special class of integral domains called unique
factorization domains. 


Example A.1.10 The ring of integers $\mathbb{Z}$ is an integral domain, but $\mathbb{Z}/6\mathbb{Z}$ is not, since
$[2] \cdot [3] = [6] = [0]$, yet $[2]$ and $[3]$ are nonzero in $\mathbb{Z}/6\mathbb{Z}$. <>
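The failure of Z/6Z to be an integral domain can be seen by listing its zero divisors. This is a minimal sketch in which congruence classes are represented by integers 0, ..., n-1; the function name `zero_divisors` is hypothetical.

```python
def zero_divisors(n):
    """Nonzero classes a in Z/nZ with a*b = 0 for some nonzero class b."""
    return [a for a in range(1, n)
            if any(a * b % n == 0 for b in range(1, n))]

print(zero_divisors(6))   # classes such as [2] and [3] appear here
print(zero_divisors(7))   # empty: Z/7Z has no zero divisors
```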


C. Fields. A field F is a ring such that every nonzero element has a multiplicative 
inverse. To avoid trivial examples, we assume that $0 \neq 1$ in $F$. Commonly used fields
include: 


Q = the field of rational numbers, 
R = the field of real numbers, 
C = the field of complex numbers. 


Note that a field is always an integral domain. Also recall that the only ideals of a
field F are {0} and F itself. 

One way to create fields is via the field of fractions of an integral domain $R$. This
is defined to be the set

\[
K = \left\{ \frac{r}{s} \;\middle|\; r, s \in R,\ s \neq 0 \right\},
\]

where we regard $r/s$ and $t/u$ as equal if and only if $ru = st$. This becomes a ring
under the operations

\[
\frac{r}{s} + \frac{t}{u} = \frac{ru + st}{su}, \qquad
\frac{r}{s} \cdot \frac{t}{u} = \frac{rt}{su}.
\]

If $r/s \neq 0$, then the multiplicative inverse of $r/s$ is $s/r$. Thus $K$ is a field. We call $K$
the "field of fractions" of $R$, though the term "quotient field" is also used.

Note that the function

\[
\varphi : R \to K
\]

defined by $\varphi(r) = r/1$ is a one-to-one ring homomorphism, so that $R \simeq \varphi(R)$. In
this situation, we usually identify $R$ with $\varphi(R)$. This allows us to regard an integral
domain $R$ as a subset of its field of fractions $K$.


Example A.1.11 The field of fractions of Z is the field of rational numbers Q. <> 


A second important method for creating fields is by means of maximal ideals.
An ideal $M \subset R$ is maximal if $M \neq R$ and for all ideals $J$ of $R$, $M \subset J \subset R$ implies
$J = M$ or $J = R$. Most abstract algebra courses prove the following theorem that
characterizes maximal ideals in terms of their quotient rings. 


Theorem A.1.12 Let $M$ be an ideal of a ring $R$. Then $R/M$ is a field if and only if $M$
is a maximal ideal.


For Z, we can determine the maximal ideals as follows. 


Example A.1.13 One easily checks that $n\mathbb{Z} \subset m\mathbb{Z}$ if and only if $m$ divides $n$. It
follows that $p\mathbb{Z}$ is a maximal ideal of $\mathbb{Z}$ if and only if $p$ is prime (be sure you see
why). By Theorem A.1.12, $\mathbb{Z}/p\mathbb{Z}$ is a field. It is customary to denote this field by
$\mathbb{F}_p$. This field has $p$ elements. In Chapter 11, we describe all finite fields. <>
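For small n one can confirm numerically that Z/nZ is a field exactly when n is prime. The sketch below is a brute-force check, not an efficient test; `is_field` and `is_prime` are names introduced here for illustration.

```python
def is_field(n):
    """True when every nonzero class in Z/nZ has a multiplicative inverse."""
    return n > 1 and all(
        any(a * b % n == 1 for b in range(1, n)) for a in range(1, n)
    )

def is_prime(n):
    """Trial division, adequate for the small moduli used here."""
    return n > 1 and all(n % d for d in range(2, int(n ** 0.5) + 1))

# Compare the two conditions for every modulus from 2 to 39.
agree = all(is_field(n) == is_prime(n) for n in range(2, 40))
```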


Theorem A.1.12 is used in Chapter 3 to prove that any polynomial with coefficients 
in a field has roots in some possibly larger field. 

We next discuss the characteristic of a field $F$. Given a positive integer $n$, define

\[
n \cdot 1 = \underbrace{1 + \cdots + 1}_{n \text{ times}} \in F,
\]

where 1 is the multiplicative identity of $F$.

The distributive law implies that $(n \cdot 1)(m \cdot 1) = (nm) \cdot 1$. If $n \cdot 1 = 0$ for some
positive $n$, then let $p$ be the least such number. We claim that $p$ is prime. This is easy
to see, for if we had $p = ab$ with $0 < a, b < p$, then

\[
0 = p \cdot 1 = (ab) \cdot 1 = (a \cdot 1)(b \cdot 1).
\]

Since $F$ is an integral domain, we would have $a \cdot 1 = 0$ or $b \cdot 1 = 0$, which would
contradict the minimality of $p$. Thus $p$ is prime.

Because of this, we say that $F$ has characteristic 0 if $n \cdot 1 \neq 0$ for all positive
integers $n$ and has characteristic $p$ if $p \cdot 1 = 0$ and $p$ is prime.

Thus $\mathbb{Q}$, $\mathbb{R}$, and $\mathbb{C}$ all have characteristic 0, while $\mathbb{F}_p$ has characteristic $p$. In
general, Galois theory is easier in characteristic 0 than in characteristic $p$.


D. Polynomials. A polynomial in $x$ with coefficients in a field $F$ is an expression

\[
f = a_n x^n + a_{n-1} x^{n-1} + \cdots + a_1 x + a_0
\]

where $a_n, a_{n-1}, \dots, a_1, a_0 \in F$. If $a_n \neq 0$, then we say that $f$ has degree $n$, written
$\deg(f) = n$. If $a_n = 1$, then we say that $f$ is monic.

If f and g are nonzero polynomials, then fg is also nonzero, since F is an integral 
domain. It follows easily that 


(A.4) deg(fg) = deg(f) + deg(g). 


Notice also that we have not defined the degree of the zero polynomial. One might 
be tempted to set deg(0) = 0, but this would not be consistent with (A.4) (do you see 
why?). For this reason, we prefer to leave deg(0) undefined. 

The set of all polynomials in $x$ with coefficients in $F$ forms a ring $F[x]$ under
addition and multiplication of polynomials. Note that $F[x]$ is an integral domain.
The following division algorithm is proved in most abstract algebra texts. 


Theorem A.1.14 Let $f, g \in F[x]$, and assume that $g$ is nonzero. Then there are
polynomials $q, r \in F[x]$ such that

\[
f = qg + r, \quad \text{where } r = 0 \text{ or } \deg(r) < \deg(g).
\]

Furthermore, $q$ and $r$ are unique.
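The division algorithm can be sketched over F = Q with exact rational arithmetic. In this illustration a polynomial is a list of coefficients with the lowest degree first, and the zero polynomial is the empty list; this representation and the names `trim` and `poly_divmod` are assumptions made for the example.

```python
from fractions import Fraction

def trim(p):
    """Copy p as Fractions and drop trailing zeros (lowest degree first)."""
    p = [Fraction(c) for c in p]
    while p and p[-1] == 0:
        p.pop()
    return p

def poly_divmod(f, g):
    """Return (q, r) with f = q*g + r and r == [] or deg r < deg g."""
    r, g = trim(f), trim(g)
    if not g:
        raise ZeroDivisionError("g must be nonzero")
    q = [Fraction(0)] * max(len(r) - len(g) + 1, 0)
    while len(r) >= len(g):
        shift = len(r) - len(g)
        coeff = r[-1] / g[-1]          # cancel the leading term of r
        q[shift] = coeff
        for i, c in enumerate(g):
            r[i + shift] -= coeff * c
        r = trim(r)
    return trim(q), r

# Divide x^3 - 2x + 5 by x - 2.
q, r = poly_divmod([5, -2, 0, 1], [-2, 1])
```

In this run the remainder is the constant 9, which is the value of x^3 - 2x + 5 at x = 2.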


As an application of this theorem, consider the case when $g = x - a$ for some
$a \in F$. The division algorithm implies that $f = q \cdot (x - a) + r$ where $r \in F$ (be sure
you see why). Evaluating this equation at $x = a$ yields

\[
f(a) = q(a) \cdot 0 + r,
\]

so that $r = f(a)$. Thus $f = q \cdot (x - a) + f(a)$. This leads to the following result.


Corollary A.1.15 Given $f \in F[x]$ and $a \in F$, the linear polynomial $x - a$ is a factor
of $f$ if and only if $f(a) = 0$, i.e., if $a$ is a root of $f$.


Using this corollary and induction, one easily obtains the following bound on the 
number of roots of a polynomial. 


Corollary A.1.16 Let $f \in F[x]$ be nonconstant. Then $f$ has at most $\deg(f)$ roots in
the field $F$.

In Chapter 3, we show that by going to a larger field, a polynomial $f \in F[x]$ has
exactly $\deg(f)$ roots, provided that we take the multiplicities of the roots into account.

Another application of Theorem A.1.14 is the Euclidean algorithm for computing
the greatest common divisor (or gcd) of two polynomials $f, g \in F[x]$, at least one of
which is nonzero. Recall that $\gcd(f, g)$ is the monic polynomial of maximum degree
in $F[x]$ which divides both $f$ and $g$. If $g \neq 0$, we compute $\gcd(f, g)$ by repeatedly
applying the division algorithm until we get a zero remainder:

\[
\begin{aligned}
f &= q_0 g + r_0, & \deg(r_0) &< \deg(g), \\
g &= q_1 r_0 + r_1, & \deg(r_1) &< \deg(r_0), \\
r_0 &= q_2 r_1 + r_2, & \deg(r_2) &< \deg(r_1), \\
&\;\;\vdots \\
r_n &= q_{n+2} r_{n+1} + r_{n+2}, & \deg(r_{n+2}) &< \deg(r_{n+1}), \\
r_{n+1} &= q_{n+3} r_{n+2} + 0.
\end{aligned}
\]

Then one can prove that $\gcd(f, g)$ is the monic polynomial obtained by multiplying
$r_{n+2}$ by a suitable constant. On the other hand, if $g = 0$, then one easily sees that
$\gcd(f, 0)$ is $f$ multiplied by the constant that makes it monic. In general, the greatest
common divisor has the following three properties:

• For any $h \in F[x]$, $h$ divides $\gcd(f, g)$ $\iff$ $h$ divides both $f$ and $g$.

• $\gcd(f, g) = 1$ $\iff$ $f$ and $g$ are relatively prime in $F[x]$.

• $\gcd(f, g) = Af + Bg$ for some $A, B \in F[x]$.
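The Euclidean algorithm described above can be sketched over Q as follows. Polynomials are lists of coefficients with the lowest degree first, and the empty list stands for 0; this convention and the names `trim`, `poly_rem`, and `poly_gcd` are illustrative choices, not notation from the text.

```python
from fractions import Fraction

def trim(p):
    """Copy p as Fractions and drop trailing zeros (lowest degree first)."""
    p = [Fraction(c) for c in p]
    while p and p[-1] == 0:
        p.pop()
    return p

def poly_rem(f, g):
    """Remainder of f on division by a nonzero polynomial g."""
    r, g = trim(f), trim(g)
    while len(r) >= len(g):
        shift, coeff = len(r) - len(g), r[-1] / g[-1]
        for i, c in enumerate(g):
            r[i + shift] -= coeff * c
        r = trim(r)
    return r

def poly_gcd(f, g):
    """Monic gcd of f and g, at least one of which must be nonzero."""
    f, g = trim(f), trim(g)
    while g:                          # repeat until the remainder is zero
        f, g = g, poly_rem(f, g)
    return [c / f[-1] for c in f]     # rescale the last nonzero remainder

# gcd(x^4 - 1, x^3 - 1) over Q
d = poly_gcd([-1, 0, 0, 0, 1], [-1, 0, 0, 1])
```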


One can also use Theorem A.1.14 to determine the ideals of $F[x]$.


Theorem A.1.17 Every ideal of $F[x]$ is of the form $(f) = \{fg \mid g \in F[x]\}$ for some
$f \in F[x]$.


This is proved in most abstract algebra courses. Recall that the basic idea of the
proof is that if $I \subset F[x]$ is a nonzero ideal, then pick $f \in I \setminus \{0\}$ of minimal degree.
Then one proves $I = (f)$ using the division algorithm.

In general, an integral domain in which every ideal is principal is called a principal 
ideal domain, or PID. It follows that Z and F [x] are both PIDs. 

One can also find unique generators for ideals in F[x]. For the zero ideal, the 
unique generator is of course 0. For nonzero ideals, we can use monic polynomials 
to give unique generators as follows. 


Proposition A.1.18 Every nonzero ideal of $F[x]$ can be written uniquely as $(f)$ where
$f$ is monic.


Be sure you can prove this proposition. 

In the ring of integers $\mathbb{Z}$, prime numbers play a central role. For $F[x]$, the
corresponding objects are irreducible polynomials. Recall that a nonconstant polynomial
in $F[x]$ is irreducible over $F$ if it is not a product of polynomials in $F[x]$ of strictly
smaller degree.

An important result proved in most abstract algebra texts is that every nonconstant
polynomial in $F[x]$ can be factored into a product of irreducibles, where the
factorization is unique up to order and multiplication by constants. In the terminology of
Section A.5, $F[x]$ is a unique factorization domain, or UFD.

Another important result is that the ideal (f) C F[x] is maximal if and only if the 
polynomial f € F[x] is irreducible over F. This is proved in Chapter 3 when we 
study the existence of roots. 

In general, it is not easy to test whether a given polynomial $f \in F[x]$ is irreducible
over $F$. When $\deg(f) = 2$ or 3, any nontrivial factorization of $f$ must have a factor
of degree 1. By Corollary A.1.15, having a factor of degree 1 in $F[x]$ is equivalent to
having a root in $F$. Thus we have proved the following.


Lemma A.1.19 If $f \in F[x]$ has $\deg(f) = 2$ or 3, then $f$ is irreducible over $F$ if and
only if $f$ has no roots in $F$.
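When F is the finite field with p elements, Lemma A.1.19 turns into a finite search. The sketch below assumes integer coefficients listed with the highest degree first and reduced modulo a prime p; the function names are hypothetical.

```python
def roots_mod_p(coeffs, p):
    """Roots in Z/pZ of a polynomial with integer coefficients (highest degree first)."""
    def value(x):
        acc = 0
        for c in coeffs:              # Horner's rule modulo p
            acc = (acc * x + c) % p
        return acc
    return [x for x in range(p) if value(x) == 0]

def is_irreducible_low_degree(coeffs, p):
    """Root test of Lemma A.1.19; only valid for degree 2 or 3."""
    if len(coeffs) - 1 not in (2, 3) or coeffs[0] % p == 0:
        raise ValueError("degree must be 2 or 3 modulo p")
    return not roots_mod_p(coeffs, p)
```

For example, x^2 + x + 1 has no roots modulo 2 but has the root 1 modulo 3, so the test reports it irreducible in the first case and reducible in the second. The test says nothing in degree 4 or higher, where a polynomial can factor without having a root.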


See Sections A.3, A.5, and 4.2 for more about factorization. 


A.2. COMPLEX NUMBERS 


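In this appendix, we take a naive point of view and regard $\mathbb{C}$ as the set of numbers
$a + bi$, where $i = \sqrt{-1}$ and $a, b \in \mathbb{R}$. A rigorous algebraic construction of $\mathbb{C}$ is
presented in Chapter 3. Given $z = a + bi$, we define

\[
\begin{aligned}
\operatorname{Re}(z) &= a && \text{the real part of } z, \\
\operatorname{Im}(z) &= b && \text{the imaginary part of } z, \\
\bar{z} &= a - bi && \text{the complex conjugate of } z.
\end{aligned}
\]

Furthermore, the absolute value of $z = a + bi$ is

\[
|z| = \sqrt{z\bar{z}} = \sqrt{a^2 + b^2}.
\]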


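A. Addition, Multiplication, and Division. Addition and multiplication of
complex numbers are defined by

\[
\begin{aligned}
(a + bi) + (c + di) &= (a + c) + (b + d)i, \\
(a + bi)(c + di) &= (ac - bd) + (bc + ad)i
\end{aligned}
\]

and satisfy

\[
\begin{aligned}
\overline{z + w} &= \bar{z} + \bar{w}, \\
\overline{zw} &= \bar{z}\,\bar{w}.
\end{aligned} \tag{A.5}
\]

Under these operations, $\mathbb{C}$ is a ring with additive identity $0 = 0 + 0i$ and
multiplicative identity $1 = 1 + 0i$. To see that $\mathbb{C}$ is a field, note that if $z = a + bi \neq 0$ (which
means that $a$ and $b$ are not both 0), then

\[
\frac{1}{z} = \frac{1}{a + bi} = \frac{1}{a + bi} \cdot \frac{a - bi}{a - bi} = \frac{a - bi}{a^2 + b^2} = \frac{a}{a^2 + b^2} - \frac{b}{a^2 + b^2}\, i.
\]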

If we think of $z = a + bi$ as the point $(a, b)$ in the plane, we can also represent $z$
using polar coordinates $(r, \theta)$. Since $r = \sqrt{a^2 + b^2} = |z|$, we get the picture

[Figure: the point $z = a + bi \leftrightarrow (a, b)$. Caption: Polar coordinates in the complex plane $\mathbb{C}$]

In this situation, we follow Euler and define

\[
e^{i\theta} = \cos\theta + i\sin\theta. \tag{A.6}
\]

The relation between polar and Cartesian coordinates implies that $a = |z|\cos\theta$ and
$b = |z|\sin\theta$. Hence

\[
z = a + bi = |z|\cos\theta + |z|\sin\theta\, i = |z| e^{i\theta}.
\]

This is the polar representation of $z$. In Exercise 1, you will prove that

\[
\begin{aligned}
|zw| &= |z|\,|w|, \\
e^{i\theta} e^{i\phi} &= e^{i(\theta + \phi)}.
\end{aligned} \tag{A.7}
\]

It follows that if $z = |z| e^{i\theta}$ and $w = |w| e^{i\phi}$, then the polar representation of $zw$ is

\[
zw = |z|\,|w|\, e^{i(\theta + \phi)}. \tag{A.8}
\]

Thus we multiply lengths and add angles when we multiply two complex numbers.
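The rule (A.8) is easy to test numerically with Python's standard `cmath` module. The helper name `polar_product` is introduced here for illustration, and the comparison is only up to floating-point error.

```python
import cmath

def polar_product(z, w):
    """Multiply z and w by multiplying lengths and adding angles, as in (A.8)."""
    (rz, tz), (rw, tw) = cmath.polar(z), cmath.polar(w)
    return cmath.rect(rz * rw, tz + tw)

z, w = 1 + 2j, 3 - 1j
error = abs(polar_product(z, w) - z * w)   # should be at rounding level
```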


B. Roots of Complex Numbers. We next consider the roots of the polynomial
$x^n - a$, where $a \in \mathbb{C}$ and $n \in \mathbb{Z}$ is positive. The solutions of $x^n - a = 0$ are called
the $n$th roots of $a$. To describe the $n$th roots, write our given complex number $a$ as
$a = |a| e^{i\theta}$. We will assume that $a \neq 0$, so that $|a|$ is positive. We seek a complex
number $w$ such that $w^n = a$. If we write $w = |w| e^{i\phi}$, then Exercise 2 implies that

\[
w^n = |w|^n e^{in\phi}, \tag{A.9}
\]

so that the equation $w^n = a$ becomes

\[
|w|^n e^{in\phi} = |a| e^{i\theta}.
\]

This equation is clearly satisfied if $|w|^n = |a|$ and $n\phi = \theta$, i.e., if $|w| = \sqrt[n]{|a|}$ and
$\phi = \theta/n$. In other words, the complex number

\[
w = \sqrt[n]{|a|}\, e^{i\theta/n} \tag{A.10}
\]
is an $n$th root of $a$.

In polar coordinates, we can change the angle by an integer multiple of $2\pi$ without
changing the point. For the polar representation $a = |a| e^{i\theta}$, this means that we can
write $a = |a| e^{i(\theta + 2\pi m)}$ for any $m \in \mathbb{Z}$. Then, if we apply (A.10) to this representation
of $a$, we get the $n$th root

\[
w = \sqrt[n]{|a|}\, e^{i(\theta + 2\pi m)/n}. \tag{A.11}
\]

As we vary $m \in \mathbb{Z}$, we claim that this gives precisely $n$ distinct $n$th roots of $a$.

To prove this, note that $(\theta + 2\pi m_1)/n$ and $(\theta + 2\pi m_2)/n$ differ by an integer
multiple of $2\pi$ if and only if $m_1 \equiv m_2 \bmod n$. Hence, in (A.11), we can assume that
$m = 0, 1, \dots, n - 1$, which gives the $n$th roots

\[
\sqrt[n]{|a|}\, e^{i\theta/n},\ \sqrt[n]{|a|}\, e^{i(\theta + 2\pi)/n},\ \dots,\ \sqrt[n]{|a|}\, e^{i(\theta + 2\pi(n-1))/n}. \tag{A.12}
\]

Note that $\theta/n, (\theta + 2\pi)/n, \dots, (\theta + 2\pi(n-1))/n$ are $n$ distinct angles in the plane
since no two differ by an integer multiple of $2\pi$. Thus we have proved the following.


Proposition A.2.1 Every $a \neq 0$ in $\mathbb{C}$ has $n$ distinct $n$th roots (A.12). These are the
roots of the polynomial $x^n - a \in \mathbb{C}[x]$.
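The list (A.12) translates directly into a short computation. This sketch uses floating-point arithmetic from the standard `cmath` module, so the roots are approximations; the function name `nth_roots` is an assumption of the example.

```python
import cmath

def nth_roots(a, n):
    """Approximations to the n distinct nth roots of a nonzero complex a, as in (A.12)."""
    if a == 0 or n < 1:
        raise ValueError("need a != 0 and n >= 1")
    r, theta = cmath.polar(a)                     # a = r * e^{i theta}
    return [cmath.rect(r ** (1.0 / n), (theta + 2 * cmath.pi * m) / n)
            for m in range(n)]

roots = nth_roots(3 + 2j, 7)
```

Raising each entry of `roots` to the seventh power recovers 3 + 2i up to rounding error, and the seven values are pairwise distinct.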


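By Corollary A.1.15, each root gives a linear factor of $x^n - a$. This implies that

\[
x^n - a = \bigl(x - \sqrt[n]{|a|}\, e^{i\theta/n}\bigr) \cdots \bigl(x - \sqrt[n]{|a|}\, e^{i(\theta + 2\pi(n-1))/n}\bigr). \tag{A.13}
\]

We can simplify the above formulas using the $n$th roots of unity. If we set

\[
\zeta_n = e^{2\pi i/n},
\]

then (A.9) implies that $\zeta_n^m = e^{2\pi i m/n}$. It follows that when $a = 1$, (A.12) shows that
the roots of $x^n - 1$ are given by

\[
1,\ \zeta_n,\ \zeta_n^2,\ \dots,\ \zeta_n^{n-1}.
\]

These are the $n$th roots of unity. In this case, the factorization (A.13) becomes

\[
x^n - 1 = (x - 1)(x - \zeta_n) \cdots (x - \zeta_n^{n-1}). \tag{A.14}
\]

Returning to the $n$th roots of $a \in \mathbb{C}$, we can now simplify (A.12). By (A.7),

\[
e^{i(\theta + 2\pi m)/n} = e^{i\theta/n}\, e^{2\pi i m/n} = e^{i\theta/n}\, \zeta_n^m.
\]

Then the $n$th roots of $a$ given by (A.12) can be written as

\[
w_1,\ \zeta_n w_1,\ \zeta_n^2 w_1,\ \dots,\ \zeta_n^{n-1} w_1, \quad \text{where } w_1 = \sqrt[n]{|a|}\, e^{i\theta/n},
\]

and the factorization (A.13) simplifies to

\[
x^n - a = x^n - w_1^n = (x - w_1)(x - \zeta_n w_1)(x - \zeta_n^2 w_1) \cdots (x - \zeta_n^{n-1} w_1). \tag{A.15}
\]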

For small $n$, the root of unity $\zeta_n = e^{2\pi i/n} = \cos(\frac{2\pi}{n}) + i\sin(\frac{2\pi}{n})$ is easy to work out.
For example, standard facts from trigonometry imply that

\[
\begin{aligned}
\zeta_2 &= \cos\pi + i\sin\pi = -1, \\
\zeta_3 &= \cos(\tfrac{2\pi}{3}) + i\sin(\tfrac{2\pi}{3}) = \tfrac{1}{2}(-1 + i\sqrt{3}), \\
\zeta_4 &= \cos(\tfrac{\pi}{2}) + i\sin(\tfrac{\pi}{2}) = i.
\end{aligned}
\]

Thus the square roots of unity are $1, -1$; the cube roots are $1, \zeta_3, \zeta_3^2$; and the fourth
roots are $\pm 1, \pm i$. We often denote the cube root of unity $\zeta_3$ by $\omega$. Exercise 6 below
will show how the formula for $\omega = \zeta_3$ follows from the quadratic formula.

Roots of unity appear in several places in the text: in Chapter 8, where we study
solvability by radicals; in Chapter 9, where we compute the minimal polynomial
of $\zeta_n$; and in Chapter 10, where we explore Gauss's work on the constructibility of
regular polygons.


Exercises for Section A.2 


Exercise 1. Prove (A.7). 


Exercise 2. Let $z = |z| e^{i\theta}$ be the polar representation of $z \in \mathbb{C}$. Prove that

\[
z^n = |z|^n e^{in\theta}, \quad n > 0 \text{ in } \mathbb{Z},
\]

using induction on $n$ and (A.8).


Exercise 3. This exercise will discuss De Moivre's formula, which states that

\[
(\cos\theta + i\sin\theta)^n = \cos n\theta + i\sin n\theta, \quad n > 0 \text{ in } \mathbb{Z}.
\]

(a) Show that De Moivre's formula follows from (A.6) and Exercise 2.

(b) Use De Moivre's formula and the binomial theorem for $n = 4$ and 5 to express $\cos 4\theta$,
$\sin 4\theta$, $\cos 5\theta$, and $\sin 5\theta$ in terms of $\cos\theta$ and $\sin\theta$.

(c) Use De Moivre's formula and the binomial theorem to prove that $\cos n\theta$ can be written as
a polynomial in $\cos\theta$ with integer coefficients.


Exercise 4. Use a calculator to find a seventh root of $3 + 2i$. Note that $\theta = \tan^{-1}(\frac{2}{3})$.


Exercise 5. For n = 4, 5, and 6, draw a picture to show how the nth roots of unity form the 
vertices of a regular n-gon inscribed in the unit circle in the complex plane. 


Exercise 6. The cube root of unity $\omega = \zeta_3$ is a root of $x^3 - 1 = (x - 1)(x^2 + x + 1)$. Use the
quadratic formula to show that $\omega$ and $\omega^2$ are given by $\frac{1}{2}(-1 \pm i\sqrt{3})$.


Exercise 7. Use $\zeta_n^n = 1$ and $|\zeta_n| = 1$ to show that $\bar{\zeta}_n = \zeta_n^{n-1} = 1/\zeta_n$.


Exercise 8. This exercise will derive an explicit formula for the fifth root of unity $\zeta_5$ using the
factorization $x^5 - 1 = (x - 1)(x^4 + x^3 + x^2 + x + 1)$.
(a) Use Exercise 7 to show that if $x = \zeta_5$, then $x + 1/x = 2\cos(\frac{2\pi}{5})$.
(b) Explain why $x = \zeta_5$ satisfies $x^2 + x + 1 + 1/x + 1/x^2 = 0$. Then show that $y = x + 1/x$ is
a root of $y^2 + y - 1$.
(c) Use part (b) to conclude that $\cos(\frac{2\pi}{5}) = \frac{1}{4}(-1 + \sqrt{5})$. Then show that

\[
\zeta_5 = \frac{-1 + \sqrt{5}}{4} + \frac{i}{2}\sqrt{\frac{5 + \sqrt{5}}{2}}.
\]

Chapter 10 explains how this relates to the straightedge-and-compass construction of the
regular pentagon.


Exercise 9. In this exercise, you will give two proofs of the identity

\[
1 + \zeta_n + \zeta_n^2 + \cdots + \zeta_n^{n-1} = 0, \quad n > 1 \text{ in } \mathbb{Z}.
\]

(a) Show that this identity follows from (A.14) by comparing the coefficients of $x^{n-1}$.
(b) Give a second proof using the factorization $x^n - 1 = (x - 1)(x^{n-1} + \cdots + 1)$.
(c) More generally, use part (b) to show that $m \not\equiv 0 \bmod n$ implies that

\[
1 + \zeta_n^m + \zeta_n^{2m} + \cdots + \zeta_n^{(n-1)m} = 0.
\]

Also determine the sum on the left-hand side when $m \equiv 0 \bmod n$.


Exercise 10. The eighth root of unity $\zeta_8$ is given by $\cos(\frac{\pi}{4}) + i\sin(\frac{\pi}{4}) = \frac{\sqrt{2}}{2}(1 + i)$.

(a) Show that the eighth roots of unity are given by $\pm 1$, $\pm i$, $\frac{\sqrt{2}}{2}(\pm 1 \pm i)$.
(b) Use the factorization of $x^4 + 1$ given at the end of Section A.3 to show that

\[
x^8 - 1 = (x - 1)(x + 1)(x^2 + 1)(x^2 + \sqrt{2}\,x + 1)(x^2 - \sqrt{2}\,x + 1),
\]

and explain how this factorization relates to part (a).


A.3. POLYNOMIALS WITH RATIONAL COEFFICIENTS 


We next discuss the polynomial ring Q[x]. In this case, we often take a polynomial 
with rational coefficients and multiply it by a constant to clear denominators, giving 
a polynomial with integer coefficients. In general, we let Z[x] denote the ring of 
polynomials in x with coefficients in Z. 

As is well known, we can describe the rational roots of $f \in \mathbb{Z}[x]$ as follows.


Proposition A.3.1 Let $f = a_n x^n + \cdots + a_0 \in \mathbb{Z}[x]$ be nonconstant. If $p/q \in \mathbb{Q}$ is a
root of $f$, where $p, q \in \mathbb{Z}$ are relatively prime, then $p \mid a_0$ and $q \mid a_n$.
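Proposition A.3.1 leaves only finitely many candidates p/q to test when the leading coefficient is nonzero. The sketch below enumerates them with exact arithmetic; coefficients are listed with the highest degree first, the root 0 is handled separately (when the constant term vanishes, the divisibility condition on p is vacuous), and the names `divisors` and `rational_roots` are hypothetical.

```python
from fractions import Fraction

def divisors(n):
    """Positive divisors of a nonzero integer n."""
    n = abs(n)
    return [d for d in range(1, n + 1) if n % d == 0]

def rational_roots(coeffs):
    """Rational roots of an integer polynomial, coefficients highest degree first."""
    coeffs = list(coeffs)
    if len(coeffs) < 2 or coeffs[0] == 0:
        raise ValueError("need a nonconstant polynomial with nonzero leading coefficient")
    roots = set()
    if coeffs[-1] == 0:               # x divides f, so 0 is a root
        roots.add(Fraction(0))
        while coeffs[-1] == 0:
            coeffs.pop()
    for p in divisors(coeffs[-1]):
        for q in divisors(coeffs[0]):
            for cand in (Fraction(p, q), Fraction(-p, q)):
                value = Fraction(0)
                for c in coeffs:      # Horner evaluation at the candidate
                    value = value * cand + c
                if value == 0:
                    roots.add(cand)
    return sorted(roots)
```

For instance, `rational_roots([1, 0, 0, -2])` returns an empty list, so by Lemma A.1.19 the cubic x^3 - 2 is irreducible over Q.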


Note that combining Lemma A.1.19 and Proposition A.3.1 gives an algorithm
for deciding whether a polynomial in $\mathbb{Q}[x]$ of degree 2 or 3 is irreducible over $\mathbb{Q}$.
In Section 4.2, we show that a similar algorithm exists when the degree is greater
than 3. The crucial result, due to Gauss, is that we can reduce factorization in $\mathbb{Q}[x]$
to factorization in $\mathbb{Z}[x]$. This is Gauss's Lemma, which is stated as follows.


Theorem A.3.2 Suppose that $f \in \mathbb{Z}[x]$ is nonconstant and that $f = gh$ where $g, h \in
\mathbb{Q}[x]$. Then there is a nonzero $\delta \in \mathbb{Q}$ such that $\tilde{g} = \delta g$ and $\tilde{h} = \delta^{-1} h$ have integer
coefficients. Thus $f = \tilde{g}\tilde{h}$ in $\mathbb{Z}[x]$.
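
Proof: Let $s \in \mathbb{Z}$ be a common denominator for the coefficients of $g$, so that $g = \frac{1}{s} g_1$
for $g_1 \in \mathbb{Z}[x]$. Then let $r \in \mathbb{Z}$ be the greatest common divisor of the coefficients of
$g_1$. Factoring out $r$ enables us to write

\[
g = \frac{1}{s}\, g_1 = \frac{r}{s}\, g_2,
\]

where $g_2 \in \mathbb{Z}[x]$ has relatively prime coefficients. Similarly, we can write

\[
h = \frac{1}{u}\, h_1 = \frac{t}{u}\, h_2,
\]

where $h_1, h_2 \in \mathbb{Z}[x]$ and $h_2$ has relatively prime coefficients.

Let $\delta = \frac{s}{r}$, and observe that

\[
\begin{aligned}
\tilde{g} &= \delta g = \frac{s}{r} \cdot \frac{r}{s}\, g_2 = g_2 \in \mathbb{Z}[x], \\
\tilde{h} &= \delta^{-1} h = \frac{r}{s} \cdot \frac{t}{u}\, h_2 = \frac{rt}{su}\, h_2.
\end{aligned}
\]

If we can show that $su \mid rt$, then $\tilde{h} = \frac{rt}{su} h_2 \in \mathbb{Z}[x]$ will follow, since $h_2 \in \mathbb{Z}[x]$. This will
prove the theorem.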

Hence it remains to show that $su \mid rt$. For this purpose, we will prove that if $p$ is
prime, then $p^k \mid su$ implies that $p^k \mid rt$ (do you see why this implies $su \mid rt$?). So pick a
prime $p$ and suppose that $p^k \mid su$. Then write

\[
\begin{aligned}
g_2 &= b_l x^l + \cdots + b_0, \\
h_2 &= c_m x^m + \cdots + c_0.
\end{aligned} \tag{A.16}
\]

Since $b_l, \dots, b_0$ have no nontrivial common factors, we can find an index $i \geq 0$
such that $p \mid b_0, \dots, p \mid b_{i-1}$ and $p \nmid b_i$. Similarly, there is an index $j \geq 0$ such that
$p \mid c_0, \dots, p \mid c_{j-1}$ and $p \nmid c_j$.

Multiplying the expressions for $g_2$ and $h_2$ given in (A.16), we see that the coefficient
of $x^{i+j}$ in $g_2 h_2$ is $d_{i+j} = b_0 c_{i+j} + \cdots + b_{i+j} c_0$. We can write this in the following
form:

\[
d_{i+j} = \underbrace{b_0 c_{i+j} + \cdots + b_{i-1} c_{j+1}}_{p \text{ divides } b_0, \dots, b_{i-1}} + b_i c_j + \underbrace{b_{i+1} c_{j-1} + \cdots + b_{i+j} c_0}_{p \text{ divides } c_{j-1}, \dots, c_0}. \tag{A.17}
\]

Since $p \nmid b_i c_j$, this shows that $d_{i+j}$ is relatively prime to $p$. Thus $\gcd(p^k, d_{i+j}) = 1$.
Next observe that $f = gh = \frac{r}{s} g_2 \cdot \frac{t}{u} h_2$ implies that

\[
su f = rt\, g_2 h_2. \tag{A.18}
\]

Since $p^k \mid su$ and $f \in \mathbb{Z}[x]$, we see that $p^k$ divides the coefficient of $x^{i+j}$ on the left-hand
side of (A.18). However, the coefficient of $x^{i+j}$ on the right-hand side of (A.18) is
$rt\, d_{i+j}$, and it follows that $p^k$ divides $rt\, d_{i+j}$. Since $p^k$ is relatively prime to $d_{i+j}$, we
conclude that $p^k$ must divide $rt$, which is what we needed to show. This completes
the proof. ∎


We will generalize Gauss’s Lemma in Section A.5. 
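The rescaling in Gauss's Lemma can be illustrated on a small example. In this sketch the factor δ is taken to be s/r as in the proof of Theorem A.3.2, which is the reciprocal of the positive rational r/s computed by the helper `content`; both function names are introduced only for this illustration, and g is assumed nonzero.

```python
from fractions import Fraction
from functools import reduce
from math import gcd

def content(poly):
    """Positive rational r/s with poly = (r/s) * (integer polynomial with coprime coefficients)."""
    coeffs = [Fraction(c) for c in poly]
    # s: least common multiple of the denominators
    s = reduce(lambda a, b: a * b // gcd(a, b), (c.denominator for c in coeffs), 1)
    # r: gcd of the integer coefficients of s * poly
    r = reduce(gcd, (int(c * s) for c in coeffs), 0)
    return Fraction(r, s)

def gauss_rescale(g, h):
    """Return (delta*g, h/delta) with delta = 1/content(g)."""
    delta = 1 / content(g)
    return [Fraction(c) * delta for c in g], [Fraction(c) / delta for c in h]

# f = 2x^2 + 3x + 1 = gh with g = (2/3)x + 2/3 and h = 3x + 3/2
g = [Fraction(2, 3), Fraction(2, 3)]
h = [Fraction(3), Fraction(3, 2)]
g_tilde, h_tilde = gauss_rescale(g, h)
```

Here δ = 3/2, and the rescaled factors are x + 1 and 2x + 1, both with integer coefficients.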


A.4. GROUP ACTIONS


Some of the most interesting groups arise as the symmetries of particular mathe- 
matical objects. This leads to the notion of a group action. Here is the precise 
definition. 


Definition A.4.1 Let $G$ be a group and $X$ be a set. Then an action of $G$ on $X$ is a
function $G \times X \to X$, written $(g, x) \mapsto g \cdot x$, such that

(a) $e \cdot x = x$ for all $x \in X$.

(b) $g \cdot (h \cdot x) = (gh) \cdot x$ for all $g, h \in G$ and $x \in X$.


Here are some simple examples of group actions. 


Example A.4.2 The symmetric group $S_n$ acts on the set $\{1, 2, \dots, n\}$. If $\sigma \in S_n$ and
$i \in \{1, 2, \dots, n\}$, then $\sigma \cdot i$ is just $\sigma(i)$. <>


Example A.4.3 Let $\mathrm{GL}(n, \mathbb{R})$ be the set of invertible $n \times n$ matrices with real entries.
This is a group under matrix multiplication and acts on $\mathbb{R}^n$ in the following way: if
$A \in \mathrm{GL}(n, \mathbb{R})$ and $v \in \mathbb{R}^n$, then $A \cdot v$ is the matrix product $Av$, where we think of $v$ as
a column vector. <>


Example A.4.4 Let $S^1 = \{e^{i\theta} \mid \theta \in \mathbb{R}\}$ be the set of complex numbers of absolute
value 1. This is a group under multiplication of complex numbers, and $S^1$ acts on $\mathbb{C}$
by multiplication. <>


We next define some important concepts related to a group action. 


Definition A.4.5 Let a group $G$ act on a set $X$, and let $x \in X$.
(a) The orbit of $x$ is the set $G \cdot x = \{g \cdot x \mid g \in G\}$.
(b) The isotropy subgroup of $x$ is the subgroup $G_x = \{g \in G \mid g \cdot x = x\}$.


Here are some examples of orbits and isotropy subgroups. 


Example A.4.6 In the action of $S_n$ on $X = \{1, 2, \dots, n\}$, the orbit of any $i \in X$ is all
of $X$, and the isotropy subgroup of $i$ consists of all permutations which fix $i$. Do you
see why the isotropy subgroup is isomorphic to $S_{n-1}$? <>


Example A.4.7 Let $H = \langle \sigma \rangle$ be the cyclic subgroup generated by $\sigma \in S_n$. Then $H$
acts on $X = \{1, 2, \dots, n\}$. In Exercise 1, you will show that the orbits of this action
correspond to the decomposition of $\sigma$ into a product of disjoint cycles. <>


Example A.4.8 In the action of $S^1$ on $\mathbb{C}$, consider a point $z \neq 0$. Then the orbit $S^1 \cdot z$
of $z$ is the circle of radius $|z|$ centered at the origin, and the isotropy subgroup of $z$ is
trivial. <>


If $G$ acts on $X$, then one can easily show that

\[
x \sim y \iff x = g \cdot y \text{ for some } g \in G
\]

is an equivalence relation on $X$ whose equivalence classes are the orbits of $G$. It
follows that $X$ is a disjoint union of orbits. Furthermore, the isotropy subgroups of
points on the same orbit are related as follows:

\[
G_{g \cdot x} = g\, G_x\, g^{-1} \tag{A.19}
\]

for $g \in G$ and $x \in X$. Proofs of these assertions can be found in Section 1.12 of
Volume I of [Jacobson].

The Fundamental Theorem of Group Actions relates orbits to cosets of the isotropy 
subgroup as follows. 


Theorem A.4.9 Let $G$ act on $X$, and let $G_x \subset G$ be the isotropy subgroup of $x \in X$.
Then:

(a) There is a one-to-one correspondence

\[
\{\text{left cosets of } G_x \text{ in } G\} \simeq G \cdot x.
\]

(b) If $G$ is finite, then

\[
[G : G_x] = |G \cdot x|.
\]

Thus $|G| = |G_x|\,|G \cdot x|$, so that $|G|$ is divisible by both $|G_x|$ and $|G \cdot x|$.


Proof: Let $G/G_x = \{gG_x \mid g \in G\}$ be the set of left cosets of the isotropy subgroup.
Then define $\varphi : G/G_x \to G \cdot x$ by

\[
\varphi(gG_x) = g \cdot x.
\]

We first need to show that this map is well defined. If $g_1 G_x = g_2 G_x$, then $g_1 = g_2 h$
for some $h \in G_x$ (be sure you know why). Then

\[
g_1 \cdot x = (g_2 h) \cdot x = g_2 \cdot (h \cdot x) = g_2 \cdot x,
\]

where the second equality follows from Definition A.4.1, and the third follows from
$h \in G_x$. This proves that $\varphi$ is well defined.

Since every $y \in G \cdot x$ is of the form $y = g \cdot x$ for some $g \in G$, we see that

\[
y = g \cdot x = \varphi(gG_x).
\]

Thus $\varphi$ is onto. To show that $\varphi$ is also one-to-one, suppose that $\varphi(g_1 G_x) = \varphi(g_2 G_x)$.
By the definition of $\varphi$, this implies that

\[
g_1 \cdot x = g_2 \cdot x.
\]

Using the properties of group actions, we obtain

\[
x = e \cdot x = (g_1^{-1} g_1) \cdot x = g_1^{-1} \cdot (g_1 \cdot x) = g_1^{-1} \cdot (g_2 \cdot x) = (g_1^{-1} g_2) \cdot x.
\]

Thus $g_1^{-1} g_2 \in G_x$, so that $g_1 G_x = g_2 G_x$. Hence $\varphi$ is one-to-one. From here, the rest
of the theorem follows easily. ∎
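The counting statement in Theorem A.4.9 can be checked by brute force for permutation groups. This sketch uses the set {0, ..., n-1} in place of {1, ..., n} and stores a permutation as a tuple of images; the names `act`, `orbit`, and `stabilizer` are chosen here for illustration.

```python
from itertools import permutations

def act(perm, i):
    """A permutation (tuple of images) acting on the point i."""
    return perm[i]

def orbit(group, x):
    return {act(g, x) for g in group}

def stabilizer(group, x):
    """Isotropy subgroup of x, returned as a list of group elements."""
    return [g for g in group if act(g, x) == x]

S4 = list(permutations(range(4)))
# Cyclic subgroup generated by the 4-cycle 0 -> 1 -> 2 -> 3 -> 0.
C4 = [(0, 1, 2, 3), (1, 2, 3, 0), (2, 3, 0, 1), (3, 0, 1, 2)]
# Subgroup generated by the transposition swapping 0 and 1.
H = [(0, 1, 2, 3), (1, 0, 2, 3)]

for G in (S4, C4, H):
    for x in range(4):
        # |G| = |G_x| * |G . x|
        assert len(G) == len(stabilizer(G, x)) * len(orbit(G, x))
```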

Chapter 12 discusses the special case of this theorem discovered by Lagrange in
his study of the roots of polynomials.

The following definition is used in Section 6.3 when we study how the Galois 
group acts on the roots of a polynomial. 


Definition A.4.10 The action of $G$ on $X$ is transitive if for every $x, y \in X$, there is
$g \in G$ such that $g \cdot x = y$.


More on group actions may be found in Section 1.12 of Volume I of [Jacobson]. 
Exercises for Section A.4 


Exercise 1. As in Example A.4.7, let $H = \langle \sigma \rangle$ be the cyclic subgroup generated by $\sigma \in S_n$.
Assume that $\sigma \neq e$ and that

\[
\sigma = \tau_1 \tau_2 \cdots \tau_r
\]

is the decomposition of $\sigma$ into a product of disjoint cycles. Suppose that $\tau_1 = (i_1 \cdots i_l)$.

(a) Use (A.1) to prove that $\{i_1, \dots, i_l\}$ is the orbit of $i_1$ under the action of $H$.
(b) Explain why Theorem A.4.9 implies that $l$ divides the order of $\sigma$.


Exercise 2. In the group action considered in Example A.4.8, find the orbit and isotropy 
subgroup of 0 € C. 


Exercise 3. The symmetric group $S_3$ is sometimes introduced as the symmetry group of an
equilateral triangle $\Delta$.
(a) In the language of this section, explain how $S_3$ acts on $\Delta$. You may assume that the
vertices of $\Delta$ are labeled 1, 2, 3.
(b) For each subgroup of $S_3$ given in Example A.1.6, determine all points $p \in \Delta$ whose
isotropy subgroup is the given subgroup of $S_3$. Also describe the orbit of $p$.


Exercise 4. Given a group $G$, define $g \cdot h = ghg^{-1}$ for $g, h \in G$. Prove that this is a group
action of $G$ on itself. We say that $G$ acts on itself by conjugation. Then:
(a) Prove that the orbit $G \cdot g$ is the conjugacy class $C_g$ of $g$ and that the isotropy subgroup is
the subgroup $C(g)$ consisting of all elements of $G$ that commute with $g$.
(b) Let $G$ be finite. Prove that $[G : C(g)] = |C_g|$.


Exercise 5. Prove that a group $G$ acts transitively on a set $X$ if and only if $G \cdot x = X$ for all
$x \in X$ if and only if $G \cdot x = X$ for some $x \in X$.


A.5. MORE ALGEBRA


Here are some further results about groups, rings, fields, and polynomials that will 
be used in the text. 


A. The Sylow Theorems. Let $G$ be a finite group, and let $p$ be a prime dividing
the order of $G$. Then a subgroup $H \subset G$ is called a $p$-Sylow subgroup if $|H| = p^n$,
where $p^n$ is the highest power of $p$ dividing $|G|$. Here is the basic result concerning
Sylow subgroups.


Theorem A.5.1 Let $p$ be a prime dividing the order of a finite group $G$. Then:
(a) (First Sylow Theorem) $G$ has a $p$-Sylow subgroup.
(b) (Second Sylow Theorem) Any two $p$-Sylow subgroups of $G$ are conjugate in $G$.
(c) (Third Sylow Theorem) Let $N$ be the number of $p$-Sylow subgroups of $G$. Then
$N \equiv 1 \bmod p$, and $N$ divides $|G|$.


A proof of the First Sylow Theorem can be found in [Herstein, Thm. 2.12.1], 
and the Second and Third Sylow Theorems are proved in [Herstein, Thms. 2.12.2 
and 2.12.3 and Lem. 2.12.6]. 

The Sylow Theorems have some nice applications in Chapters 8 and 14. 


B. The Chinese Remainder Theorem. The following result will be useful in 
several places in the text. Given a positive integer $n$, let $[a]_n \in \mathbb{Z}/n\mathbb{Z}$ denote the
congruence class of $a$ modulo $n$.


Lemma A.5.2 Let $n$ and $m$ be relatively prime positive integers. Then the map
$[a]_{nm} \mapsto ([a]_n, [a]_m)$ gives a well-defined ring isomorphism

\[
\mathbb{Z}/nm\mathbb{Z} \simeq \mathbb{Z}/n\mathbb{Z} \times \mathbb{Z}/m\mathbb{Z}.
\]

Proof: If $[a]_{nm} = [b]_{nm}$, then $nm \mid a - b$, from which we conclude that $([a]_n, [a]_m) =
([b]_n, [b]_m)$. Hence the map is well defined, and it is easy to see that it is a ring
homomorphism. Furthermore, if $[a]_{nm}$ is in the kernel, then $n \mid a$ and $m \mid a$, which
implies that $nm \mid a$, since $n$ and $m$ are relatively prime. Thus $[a]_{nm} = [0]_{nm}$, so that the
map is one-to-one. It is then onto, since both rings have order $nm$. ∎
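For specific small moduli, the map of Lemma A.5.2 can be tabulated and tested exhaustively. In this sketch congruence classes are represented by integers in the usual ranges, and `crt_map` and `is_ring_isomorphism` are names introduced for the example; the check is a finite verification, not a proof.

```python
def crt_map(n, m):
    """The map [a]_{nm} -> ([a]_n, [a]_m), tabulated as a dictionary."""
    return {a: (a % n, a % m) for a in range(n * m)}

def is_ring_isomorphism(n, m):
    """Check bijectivity and compatibility with + and * on all pairs of classes."""
    phi, N = crt_map(n, m), n * m
    bijective = len(set(phi.values())) == N
    additive = all(
        phi[(a + b) % N] == ((phi[a][0] + phi[b][0]) % n, (phi[a][1] + phi[b][1]) % m)
        for a in phi for b in phi
    )
    multiplicative = all(
        phi[(a * b) % N] == ((phi[a][0] * phi[b][0]) % n, (phi[a][1] * phi[b][1]) % m)
        for a in phi for b in phi
    )
    return bijective and additive and multiplicative
```

With n = 4 and m = 9 the check succeeds, while n = 4 and m = 6 fails because the map is no longer one-to-one when n and m share a common factor.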


C. The Multiplicative Group of a Field. Given a field F, its multiplicative group 
is F* = F \ {0}, which is a group under multiplication by the definition of field. The 
fact that a polynomial of degree m has at most m roots in a field implies the following 
interesting property of F*. 


Proposition A.5.3 Let $G \subset F^*$ be a finite subgroup of the multiplicative group of a
field $F$. Then $G$ is cyclic.


Proof: First observe that $G$ is Abelian because $F$ is a field. Then Theorem A.1.7
implies that $G$ is isomorphic to a product of cyclic groups, say

\[
G \simeq \mathbb{Z}/m_1\mathbb{Z} \times \cdots \times \mathbb{Z}/m_r\mathbb{Z},
\]

where $m_1, \dots, m_r$ are integers $> 1$. Thus $|G| = m_1 \cdots m_r$. If $r = 1$, then we are done.
So assume that $r \geq 2$.

Let $m = \operatorname{lcm}(m_1, \dots, m_r)$ be the least common multiple of the $m_i$. It is then easy to
verify that $g^m = 1$ for every $g \in G$. Since $G$ is a subgroup of $F^*$, it follows that every
$g \in G$ is a root of $x^m - 1 \in F[x]$. Hence this polynomial has at least $|G| = m_1 \cdots m_r$
roots in $F$. But, as noted above, $x^m - 1$ has at most $m$ roots in $F$, since $F$ is a field.
Thus

\[
m = \operatorname{lcm}(m_1, \dots, m_r) \geq m_1 \cdots m_r,
\]

which clearly implies that $\operatorname{lcm}(m_1, \dots, m_r) = m_1 \cdots m_r$. This in turn implies that
$m_1, \dots, m_r$ are pairwise relatively prime (be sure you understand why). However, if
$n$ and $m$ are relatively prime, then by Lemma A.5.2, there is a ring isomorphism

\[
\mathbb{Z}/n\mathbb{Z} \times \mathbb{Z}/m\mathbb{Z} \simeq \mathbb{Z}/nm\mathbb{Z},
\]

which is also a group isomorphism if we forget multiplication. Using this repeatedly,
we obtain a group isomorphism

\[
G \simeq \mathbb{Z}/m_1\mathbb{Z} \times \cdots \times \mathbb{Z}/m_r\mathbb{Z} \simeq \mathbb{Z}/(m_1 \cdots m_r)\mathbb{Z}.
\]

This completes the proof of the proposition. ∎
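For the field with p elements, Proposition A.5.3 says that the p - 1 nonzero classes form a cyclic group, so some class has multiplicative order p - 1. The sketch below finds all such generators by brute force for a small prime p; the function names are hypothetical, and the search is only practical for small p.

```python
def multiplicative_order(a, p):
    """Order of the class of a in the multiplicative group of Z/pZ (p prime, p not dividing a)."""
    order, power = 1, a % p
    while power != 1:
        power = power * a % p
        order += 1
    return order

def generators(p):
    """All classes of order p - 1, i.e. generators of the cyclic group of nonzero classes."""
    return [a for a in range(1, p) if multiplicative_order(a, p) == p - 1]

print(generators(7))
```

For p = 7 the classes of 3 and 5 each have order 6, while 2 and 4 have order 3 and 6 has order 2.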


D. Unique Factorization Domains. Given a ring R, a unit of R is an element of 
R that has a multiplicative inverse in R. The set of all units of R is denoted R*. Note 
that R* is a group under multiplication. 

Now let R be an integral domain. We say that r € R is irreducible if it is not a unit 
and r = ab, a,b € R, implies that a or b is in R*. 


Example A.5.4 Given a field $F$, the units of the polynomial ring $F[x]$ are the nonzero
elements of $F$, i.e., $F[x]^* = F^*$. Furthermore, the irreducible elements of $F[x]$, as
defined above, are precisely the irreducible polynomials of $F[x]$. <>


Here is the precise definition of unique factorization domain. 


Definition A.5.5 An integral domain $R$ is a unique factorization domain, or UFD,
if the following two conditions hold:

(a) Every nonzero element of $R$ is either a unit or a product of irreducibles.

(b) If $r_1 \cdots r_k = s_1 \cdots s_l$, where $r_1, \dots, r_k, s_1, \dots, s_l \in R$ are irreducible, then $k = l$,
and there is a permutation $\sigma \in S_k$ such that for each $1 \leq i \leq k$ there is a unit
$a_i \in R^*$ such that $r_i = a_i s_{\sigma(i)}$.


The basic example of a UFD is the ring of integers $\mathbb{Z}$. Another important class of
examples comes from polynomial rings. Here is the basic result.


Theorem A.5.6 Let $R$ be a UFD, and let $R[x]$ be the ring of polynomials in a variable 
$x$ with coefficients in $R$. Then $R[x]$ is a UFD. ∎


A proof can be found in [Herstein, Thm. 3.11.1] or [Jacobson, Vol. I, Thm. 2.25]. 
This result implies, for example, that $\mathbb{Z}[x]$ is a UFD. In Chapter 2 we will discuss 
the ring $F[x_1,\dots,x_n]$ of polynomials in $x_1,\dots,x_n$ with coefficients in $F$. Using 
Theorem A.5.6 and induction on the number of variables, it is straightforward to 
prove the following. 


Corollary A.5.7 If $F$ is a field, then $F[x_1,\dots,x_n]$ is a UFD. ∎


In the course of proving Theorem A.5.6, one needs the following generalization 
of Gauss’s Lemma (Theorem A.3.2). 
Theorem A.5.8 Let $R$ be a UFD with field of fractions $K$. Suppose that $f \in R[x]$ is 
nonconstant and that $f = gh$ where $g,h \in K[x]$. There is a nonzero $\delta \in K$ such that 
$\tilde{g} = \delta g$ and $\tilde{h} = \delta^{-1}h$ have coefficients in $R$. Thus $f = \tilde{g}\tilde{h}$ in $R[x]$. ∎


The proof is identical to the proof of Theorem A.3.2 given in Section A.3. An immediate 
corollary of Theorem A.5.8 is that if $f \in R[x]$ is irreducible and nonconstant, 
then it is also irreducible in $K[x]$. Furthermore, if $f,g \in R[x]$ are relatively prime and 
nonconstant, then Theorem A.5.8 implies that they are also relatively prime in $K[x]$. 
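
To see what Theorem A.5.8 says in a concrete case, here is a small worked example of our own (it does not appear in the text), with $R = \mathbb{Z}$ and $K = \mathbb{Q}$. A factorization with rational coefficients is rescaled by a suitable $\delta$ into one with integer coefficients:

```latex
\begin{align*}
f &= 6x^2 + 5x + 1 = g\,h, \qquad
   g = 3x + \tfrac{3}{2}, \quad h = 2x + \tfrac{2}{3} \in \mathbb{Q}[x], \\
\delta &= \tfrac{2}{3}: \qquad
   \tilde{g} = \delta g = 2x + 1, \quad
   \tilde{h} = \delta^{-1} h = 3x + 1 \in \mathbb{Z}[x], \\
f &= \tilde{g}\,\tilde{h} = (2x + 1)(3x + 1).
\end{align*}
```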


APPENDIX B 
HINTS TO SELECTED EXERCISES 


This appendix contains hints to selected exercises in the text. 


Section 1.1 (pages 9-10) 
Exercise 2. Hint: Explain why $\bar{\omega} = \omega^2$. 


Exercise 3. Hint: By choosing the correct square root of $q^2$, show that Cardan’s formulas 
reduce to $y_1 = \sqrt[3]{-q}$, $y_2 = \omega^2\sqrt[3]{-q}$, and $y_3 = \omega\sqrt[3]{-q}$ when $p = 0$. 


Exercise 7. Hint: First show that all three polynomials give the same $z_1$ but a different $z_2$. 


Exercise 8. Hint: Use Example 1.1.1 and Exercise 7. 


Section 1.3 (pages 21-22) 

Exercise 1. (c) Hint: $f'(y_1) = 3(y_1 - \alpha)(y_1 - \beta)$ and $f(\alpha) = (\alpha - y_1)(\alpha - y_2)(\alpha - y_3)$. 
Exercise 2. (a) Hint: Use (1.22). 

Exercise 3. (a) Hint: Remember that $y_1, y_2, y_3$ are distinct. 

Exercise 4. (b) Hint: Part (c) of Exercise 1 and $\Delta \ne 0$ imply that $f(\alpha)$ and $f(\beta)$ are nonzero. 
Exercise 6. Hint: By part (a) of Exercise 3, $\Delta > 0$ implies that $p \ne 0$. 
Exercise 7. Hint: Dividing $y^3 - 15y - 4$ by $y - 4$ leads to a quadratic equation. 


Exercise 11. (b) Hint: Use Exercise 3 of Section A.2. (c) Hint: w = 2", 


Section 2.1 (page 30) 


Exercise 1. Hint: Give a proof by contradiction. Unique factorization will be useful. You may 
assume that x and y are irreducible in F [x,y]. 


Exercise 3. (c) Hint: Let $\alpha_1 = \cdots = \alpha_n = -a$ in the corollary. 


Section 2.2 (pages 39-42) 

Exercise 1. Hint: Let $i_1 < \cdots < i_r$. If $i_1 > 1$, then the exponent of $x_1$ in $x_{i_1} \cdots x_{i_r}$ is 0. 
Exercise 2. (b) Hint: Use part (a). 

Exercise 4. Hint: Express the coefficients of (2.17) in terms of $\sigma_1, \sigma_2, \sigma_3$ evaluated at the $\alpha_i$. 


Exercise 7. Hint: If $\tau \in S_n$ is fixed, you will need to explain why $\tau\sigma$ ranges over all elements 
of $S_n$ as $\sigma$ does. 


Exercise 9. (c) Hint: Use the well-ordering property of the nonnegative integers, which states 
that any strictly decreasing sequence of nonnegative integers is finite. 


Exercise 11. Hint: Use the method of Example 2.2.6 and Exercise 4. 


Exercise 16. (c) Hint: You can’t use the formulas of Chapter 1. So you need to compute 
$(1-\omega)^2(1-\omega^2)^2(\omega-\omega^2)^2$. 


Exercise 18. Hint: Use the Newton identities and explain why every $s_r$ is a polynomial in the 
$\sigma_i$ with coefficients in $\mathbb{Z}$. 
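
For experimenting with the Newton identities, a short Python sketch may help. It assumes that the $s$'s in the hint are the power sums $x_1^k + \cdots + x_n^k$ and uses the standard form of the identities; the function name `power_sums` is ours. Only integer additions and multiplications occur, which is the point of the hint.

```python
def power_sums(sigma, count):
    """Power sums s_1, ..., s_count from elementary symmetric values.

    sigma[i - 1] holds sigma_i for i = 1, ..., n. The Newton identities give
      s_k = sigma_1 s_{k-1} - sigma_2 s_{k-2} + ... + (-1)^(k-1) k sigma_k   (k <= n),
      s_k = sigma_1 s_{k-1} - ... + (-1)^(n-1) sigma_n s_{k-n}               (k > n).
    """
    n = len(sigma)
    s = []
    for k in range(1, count + 1):
        total = 0
        # terms sigma_i * s_{k-i} with k - i >= 1
        for i in range(1, min(k - 1, n) + 1):
            total += (-1) ** (i - 1) * sigma[i - 1] * s[k - i - 1]
        # the extra term k * sigma_k, present only when k <= n
        if k <= n:
            total += (-1) ** (k - 1) * k * sigma[k - 1]
        s.append(total)
    return s


if __name__ == "__main__":
    # For the roots 1, 2, 3 we have sigma_1 = 6, sigma_2 = 11, sigma_3 = 6.
    print(power_sums([6, 11, 6], 4))
```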


Exercise 20. Hint: Suppose that $\sigma_2 = P(s_1,\dots,s_n)$. Then evaluate this at $x_1 = \cdots = x_n = 0$ 
and at $x_1 = x_2 = 1$, $x_3 = \cdots = x_n = 0$. 


Section 2.3 (page 46) 
Exercise 1. Hint: To find the roots of $y^3 + 2y^2 - 3y + 5$, use the Mathematica command 
N[Solve[y^3 + 2y^2 - 3y + 5 == 0, y]] 
or the Maple command 
fsolve(y^3+2*y^2-3*y+5=0,y,complex); 


Note that fsolve normally only finds real roots, but by specifying the complex option, it will 
find all roots, real and complex. 
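
Readers without Mathematica or Maple can approximate the roots in plain Python. The sketch below uses the Durand-Kerner iteration, which is our choice of method and not necessarily what either system does internally; the name `durand_kerner` and the starting points are assumptions made for this illustration.

```python
def durand_kerner(coeffs, iterations=500):
    """Approximate all complex roots of a polynomial.

    coeffs lists the coefficients from the highest degree down,
    for example [1, 2, -3, 5] for y^3 + 2y^2 - 3y + 5.
    """
    lead = coeffs[0]
    monic = [c / lead for c in coeffs]
    degree = len(monic) - 1

    def value(z):
        # Horner's rule for the monic polynomial
        result = 0
        for c in monic:
            result = result * z + c
        return result

    # distinct, non-symmetric starting guesses
    roots = [(0.4 + 0.9j) ** k for k in range(degree)]
    for _ in range(iterations):
        for i in range(degree):
            denominator = 1
            for j in range(degree):
                if j != i:
                    denominator *= roots[i] - roots[j]
            roots[i] -= value(roots[i]) / denominator
    return roots


if __name__ == "__main__":
    for root in durand_kerner([1, 2, -3, 5]):
        print(root)
```

Like fsolve with the complex option, this returns all three roots, real and complex.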


Exercise 4. Hint: If the roots are $x_1, x_2, x_3$, then one way for this to happen is $x_1 = (x_2 + x_3)/2$, 
which gives the equation $2x_1 - x_2 - x_3 = 0$. There are two other ways this can happen. Then 
take the product of all three ways. 

Section 2.4 (pages 51-52) 


Exercise 2. Hint: Use Theorem 2.4.4. 
Exercise 5. Hint: Use Proposition 2.4.1. 
Exercise 8. (e) Hint: $F$ has characteristic $\ne 2$. (f) Hint: Use Exercise 5. 


Exercise 10. (a) Hint: $F[\sigma_1,\dots,\sigma_n] \simeq F[u_1,\dots,u_n]$ is a UFD. 


Section 3.1 (page 62) 


Exercise 2. Hint: The definition of ring homomorphism given in Section A.1 requires that $\varphi$ 
preserve the multiplicative identity. Also remember that a homomorphism is one-to-one if and 
only if its kernel is {0}. 

Section 3.2 (page 69) 


Exercise 2. (c) Hint: Solve the second equation of part (b) for y, and substitute the result into 
the first. Also remember that $b \ne 0$. 


Exercise 3. Hint: Apply the IVT to $x^2 - a$ on a suitably chosen interval. 
Exercise 5. (b) Hint: This follows from Lemma 3.2.3 and Exercise 4 with R replaced by F. 


Exercise 6. Hint: Use part (b) of Exercise 1. 


Section 4.1 (pages 80-81) 
Exercise 1. Hint: When $f(\alpha) = 0$ for $f \in F[x]$ nonzero, what equation is satisfied by $1/\alpha$? 


Exercise 4. Hint: First use Lemma 4.1.9 to show that $F(\alpha_1,\dots,\alpha_r) \subset F(\alpha_1,\dots,\alpha_n)$. Then 
use the lemma a second time. 


Exercise 6. Hint: Show that the ring homomorphism $F[x_1,\dots,x_n] \to F[\alpha_1,\dots,\alpha_n]$ given by 
$x_i \mapsto \alpha_i$ is an isomorphism. Then explain why this extends to an isomorphism of the fields of 
fractions. 


Exercise 7. (a) Hint: Remember that $g(\alpha) \ne 0$. 

Exercise 8. (a) Hint: First explain why it suffices to show that $\sqrt{3} \notin \mathbb{Q}(\sqrt{2})$. You may assume 
that $\sqrt{2}$, $\sqrt{3}$, and $\sqrt{6}$ are irrational. (b) Hint: What is $\alpha - \sqrt{3}$? 

Section 4.2 (page 88) 


Exercise 1. (c) Hint: How many roots does h — g have? What is its degree? Corollary A.1.16 
will be useful. 


Exercise 5. (b) Hint: Note that $\zeta_2$, $\zeta_3$, $\zeta_4$, $\zeta_6$, $\zeta_8$, and $\zeta_{12}$ are also roots of $x^{24} - 1$. What are the 
minimal polynomials of these numbers? 


Exercise 9. Hint: Suppose that $(g/h)^p = t$, where $g,h \in k[t]$ are relatively prime. Show that 
$g^p = t h^p$ would imply that first $g$ and then $h$ are divisible by $t$. 


Section 4.3 (page 94) 


Exercise 6. Hint: Compute $[F(\alpha, \beta) : F]$ in two ways. 
Section 4.4 (page 98) 


Exercise 2. (a) Hint: Consider the field extensions 
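$$\mathbb{Q} \subset \mathbb{Q}(\sqrt{2}) \subset \mathbb{Q}(\sqrt{2},\sqrt[3]{2}) \subset \mathbb{Q}(\sqrt{2},\sqrt[3]{2},\sqrt[4]{2}) \subset \cdots \subset L.$$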


Exercise 3. (a) Hint: Use Gauss’s Lemma. (b) Hint: The minimal polynomial of $\omega$ over $\mathbb{Q}$ is 
$x^2 + x + 1$. 


Exercise 6. Hint: If an element of $F(x)$ is algebraic over $F$, write it as $p/q$ where $p,q \in F[x]$ 
are relatively prime. If $a_n(p/q)^n + \cdots + a_0 = 0$, where $a_i \in F$, then clear denominators and 
use unique factorization to conclude that $p,q \in F$. 


Exercise 7. Hint: Take $\alpha \in L$ and consider the minimal polynomial of $\alpha$ over $F$. 


Exercise 9. Hint: Use Exercise 3. 


Section 5.1 (page 106) 


Exercise 3. Hint: Use Exercise 4 of Section 4.3 to show that $L = F(\alpha)$ for some $\alpha \in L$. Then 
let $f$ be the minimal polynomial of $\alpha$ over $F$. 


Exercise 4. Hint: Consider $-\omega$, where $\omega = e^{2\pi i/3}$. 

Exercise 5. Hint: In Section 4.2, we used Maple and Mathematica to factor f in L[x]. 
Exercise 6. Hint: Compute $\sqrt{2+\sqrt{2}} \cdot \sqrt{2-\sqrt{2}}$. 

Exercise 7. (b) Hint: If $\alpha \in L$ is a root of $f$, then compute $f(\alpha + 1)$. 

Exercise 8. (b) Hint: Use Proposition 4.2.5 and the method of Exercise 5 of Section 4.3. 
Exercise 9. (a) Hint: Combine [L : F] = n! with the proof of Theorem 5.1.5. 

Exercise 11. (a) Hint: Consider $F \subset F(\alpha) \subset L$, where $\alpha \in L$ is a root of $f$. 

Exercise 13. Hint: Apply the proposition to Q(v/2) C L. Part (a) of Exercise 7 from Section 4.1 
will be useful. 


Section 5.2 (page 109) 
Exercise 3. (c) Hint: Compute (x — a)’. 


Exercise 4. Hint: Use Theorem 4.4.10 and Exercise 1 from Section 4.4. 


Section 5.3 (pages 117-118) 
Exercise 2. (a) Hint: Treat the cases p = 2 and p > 2 separately. 
Exercise 3. (b) Hint: Use Lemma 5.3.10. 


Exercise 4. Hint: Recall how $\Delta(f)$ and $\Delta(f_p)$ are obtained from $\Delta$. You may assume that $\Delta$ 
is a polynomial in $\mathbb{Z}[\sigma_1,\dots,\sigma_n]$. 


Exercise 6. Hint: Remember that the given polynomial need not be irreducible. 


Exercise 7. (b) Hint: Look at the exponents of the nonzero terms of f. 
Exercise 8. (a) Hint: Write F = k(u)(t) and use Exercise 9 of Section 4.2. (b) Hint: Remember 
that k has characteristic 3. 


Exercise 9. (b) Hint: Express $\beta$ as a polynomial in $\alpha$. 


Exercise 13. Hint: Use Lemma 5.3.5. 


Section 5.4 (page 123) 
Exercise 4. (c) Hint: Use part (b) and Example 5.4.4. 
Exercise 5. (a) Hint: First explain why $\alpha + \lambda\beta$ and $\alpha + \mu\beta$ lie in $F(\alpha + \lambda\beta)$. 


Exercise 7. Hint: Use Theorem 5.4.1 and the previous exercise. 


Section 6.1 (pages 129-130) 

Exercise 4. (b) Hint: Think about kernels of ring homomorphisms and ideals of fields. 
Exercise 6. Hint: What is 6 - 15? 

Exercise 7. (a) Hint: See part (b) of Exercise 4. (b) Hint: Regard L as a vector space over F, 
and show that $\sigma$ is a linear map. Now use standard results from linear algebra. 

Section 6.2 (page 132) 

Exercise 3. (a) Hint: Use the method of Exercise 5 of Section 4.3. 


Exercise 6. Hint: Use Exercise 11 of Section 5.1. 


Section 6.3 (page 136) 


Exercise 4. Hint: Write down the roots of f explicitly, and determine how the elements of 
Gal(L/Q), as described in the proof of Theorem 6.2.1, act on the roots. Then look at the 
corresponding permutations in $S_4$. 


Exercise 6. Hint: Use Theorem A.4.9 from Section A.4. 


Section 6.4 (page 142) 


Exercise 4. Hint: Can you find an inverse function? 


Section 6.5 (pages 145-146) 
Exercise 3. Hint: Lemma 6.1.3 will be useful. 


Exercise 4. Hint: Use the $n$th roots of unity, and show that $\theta_i(x)$ can be chosen to be $x^i$ provided 
the roots of $x^n - 1 = 0$ are labeled appropriately. 


Exercise 7. Hint: Use Exercise 6. 


Section 7.1 (pages 153-154) 
Exercise 3. Hint: Use Proposition 7.1.6. 


Exercise 5. Hint: Use part (b) of Proposition 7.1.7 twice. 
Exercise 7. Hint: This is similar to Exercise 5. 


Exercise 8. Hint: Let $\alpha_1 = \alpha, \alpha_2, \dots, \alpha_r$ be as in the definition of $h$, and consider the subgroup 
$H = \{\sigma \in \mathrm{Gal}(L/F) \mid \sigma(\alpha) = \alpha\}$. Then study how the left cosets of $H$ act on $\alpha$. 


Exercise 12. (a) Hint: Do not use the Theorem of the Primitive Element—give a direct proof. 
(b) Hint: Use Lemma 5.3.5. 

Section 7.2 (pages 160-161) 

Exercise 2. Hint: Show that $\sigma^{-1}\mathrm{Gal}(L/\sigma K)\sigma \subset \mathrm{Gal}(L/K)$ follows from $\sigma^{-1}(\sigma K) = K$. 
Exercise 7. Hint: This follows from the argument used to prove (a) $\Leftrightarrow$ (b) in Theorem 7.2.5. 


Exercise 9. (b) Hint: Use Exercise 7. 


Section 7.3 (pages 166-167) 
Exercise 5. (b) Hint: Use the automorphism L ~ L which sends ¢ to it. 
Exercise 7. (a) Hint: Proposition 4.2.5 will be useful. 


Exercise 9. Hint for (a) $\Rightarrow$ (b): If $\mathrm{Gal}(L/F) = \{e, \sigma, \tau, \sigma\tau\}$, then consider the fixed fields of 
$\langle\sigma\rangle$ and $\langle\tau\rangle$. See also Exercise 12 of Section 7.1. 


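Exercise 10. Hint for (c) $\Rightarrow$ (a): If $\mathbb{Q}(\alpha) \ne \mathbb{Q}(\beta)$, then let $L = \mathbb{Q}(\alpha, \beta)$, and show that there 
are $\sigma, \tau \in \mathrm{Gal}(L/\mathbb{Q})$ such that $\mathbb{Q}(\alpha)$ is the fixed field of $\langle\sigma\rangle$, and $\mathbb{Q}(\beta)$ is the fixed field of 
$\langle\tau\rangle$. Then see where $\mathbb{Q}(\alpha + \beta)$ fits in the Galois correspondence, and explore how $\sigma, \tau, \sigma\tau$ act 
on $\alpha + \beta$. 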


Exercise 11. Hint: Use the Galois correspondence and Exercise 8 of Section 7.2. 


Exercise 13. Hint: Show that the Galois closure constructed in Proposition 7.1.7 can be 
realized as a subfield of L. Then use the Galois correspondence and Exercise 12. 


Exercise 14. Hint: First use part (d) of Exercise 4 of Section 6.2 to show that Gal(Q(¢,)/Q) 
is Abelian. 
Section 7.4 (page 173) 


Exercise 7. Hint: Use the Galois correspondence and Proposition 6.3.7. 


Section 7.5 (pages 185-187) 
Exercise 2. Hint: $R = F[y]$ is a UFD with field of fractions $K = F(y)$. 
Exercise 3. (a) Hint: In A and B, the coefficient of each power of x is a rational function in y. 


Exercise 12. (c) Hint: In part (b), we “broke” one of the symmetries of the polyhedron by 
moving some of the vertices. To obtain the groups in part (c), you need to “break” some of the 
symmetries in a similar way. 

Section 8.1 (page 196) 


Exercise 4. Hint: The proof is similar to part (a) of Exercise 2. 
Section 8.2 (page 200) 

Exercise 1. (b) Hint: See Exercise 8 of Section 7.3. 

Exercise 2. Hint: Use Definition 8.2.1, the Tower Theorem, and $[L : \mathbb{Q}] = 3$. 

Exercise 3. (a) Hint: Consider the intersection of all subfields of $L$ containing $K_1$ and $K_2$. 
Exercise 4. (a) Hint: Adapt the proof of Proposition 7.1.7 to show that $F(\alpha_1,\dots,\alpha_n)$ is a Galois 
closure. (b) Hint: Use Proposition 5.1.8. 

Section 8.3 (page 210) 

Exercise 2. Hint: Splitting fields. 

Exercise 3. Hint: See Exercise 9 in Section A.2. 

Exercise 4. (b) Hint: Remember that ¢, € Fi-1. 

Exercise 7. (a) Hint: Use Exercise 6. 


Exercise 8. (a) Hint: As in Exercise 3 of Section 7.4, $\alpha^3 + \beta^3 = 2A$. Also, when using 
the computer algebra system, you should write $\omega$ as $\tfrac{1}{2}(-1 + i\sqrt{3})$. (b) Hint: Recall that 
$\sigma_1 = x_1 + x_2 + x_3$. Also, what is $1 + \omega + \omega^2$? 

Section 8.4 (page 215) 

Exercise 1. Hint: Use Cauchy’s Theorem (Theorem A.1.5). 


Exercise 2. Hint: When $i, j, k, l$ are distinct, verify that $(i\,j)(k\,l) = (i\,j\,k)(j\,k\,l)$. You will also 
need to consider the case when $i, j, k, l$ are not distinct. 
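
The identity for distinct $i, j, k, l$ can be checked mechanically. The Python sketch below is only an illustration; the helper names are ours, and it assumes that a product of permutations is read from right to left (apply the right-hand factor first), which is the convention under which the identity holds.

```python
from itertools import permutations


def cycle(*points):
    """The cycle (points[0] points[1] ...), stored as a dict on the points it moves."""
    return {a: points[(i + 1) % len(points)] for i, a in enumerate(points)}


def compose(f, g):
    """The product fg: apply g first, then f (right-to-left convention)."""
    support = set(f) | set(g)
    return {x: f.get(g.get(x, x), g.get(x, x)) for x in support}


def same_permutation(f, g):
    """Compare two permutations given as dicts, treating missing points as fixed."""
    support = set(f) | set(g)
    return all(f.get(x, x) == g.get(x, x) for x in support)


def identity_holds(i, j, k, l):
    """Check (i j)(k l) = (i j k)(j k l) for distinct i, j, k, l."""
    left = compose(cycle(i, j), cycle(k, l))
    right = compose(cycle(i, j, k), cycle(j, k, l))
    return same_permutation(left, right)


if __name__ == "__main__":
    print(all(identity_holds(*t) for t in permutations(range(1, 6), 4)))
```

The case where $i, j, k, l$ are not distinct still has to be handled by hand, as the hint says.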


Exercise 5. Hint: If $\sigma, \tau$ are elements of $H$ different from $e$, then what can you say about 
$\sigma^2$, $\sigma\tau$, $\tau^2$? 


Exercise 6. Hint: If $H_1 \subset G/H$ is normal, then what can you say about $\pi^{-1}(H_1)$, where 
$\pi : G \to G/H$ takes $g$ to the coset $gH$? See Exercises 3 and 4 from Section 8.1. 
Section 8.5 (page 220) 


Exercise 1. Hint: Suppose that $F \subset L_1 \subset M_1$, where $F \subset M_1$ is radical. Explain why we 
can assume that $M_1$ is the splitting field of some polynomial $g \in F[x]$. Then let $L_2 \subset M_2$ 
be a splitting field of $g$ regarded as a polynomial in $L_2[x]$. Prove that $F \subset M_2$ is radical. 
Theorem 5.1.6 will be useful. 


Exercise 3. Hint: Use Proposition 5.3.8. 


Section 8.6 (pages 226-227) 


Exercise 3. Hint: If you use a computer to draw the graph of f, it seems clear that there are 
four real roots. To make this rigorous, you should use the Intermediate Value Theorem. 


Exercise 5. (a) Hint: What is $(\alpha + i)^p$ in characteristic $p$? 
Exercise 6. (a) Hint: Unique factorization. 


Exercise 7. Hint: Follow the proof of (b) $\Rightarrow$ (a) of Theorem 8.3.3. You will also need to 
explain why primitive mth roots of unity exist for all m not divisible by p. 
Section 9.1 (pages 236-237) 
Exercise 1. When $n = 1$, note that $[0] = [1]$ in $\mathbb{Z}/1\mathbb{Z}$, so that $[0]$ is in $(\mathbb{Z}/1\mathbb{Z})^*$. 


Exercise 5. Hint: Maple and Mathematica can factor polynomials over Q. Also, what is the 
degree of $\Phi_{195}(x)$? (See Exercise 13 for another approach to computing $\Phi_{195}(x)$.) 


Exercise 6. (a) Hint: Analyze the proof of Theorem 2.2.2. (b) Hint: Use Lemma 5.3.10 and 
Lemma 9.1.2. 


Exercise 8. (f) Hint: $p \nmid n$. 

Exercise 11. Hint: Use (9.4). 

Exercise 15. Hint: Use Exercise 14 and (9.4). 
Exercise 16. (b) Hint: Use part (a) and Lemma 9.1.1. 


Section 9.2 (pages 252-253) 
Exercise 6. Hint: Use Lemma 9.2.4 and Exercise 9 of Section A.2. 


Exercise 9. (b) Hint: To five decimal places, $(-1 + \sqrt{17})/2 = 1.56155$ and $(-1 - \sqrt{17})/2 = 
-2.56155$, yet the quadratic formula says that $(8,1) = (-1 \pm \sqrt{17})/2$ for some choice of sign. 
(c) Hint: When computing $\sqrt{(4,1)^2 - 4(4,3)}$, do not compute $(4,1)^2$ using part (b). Rather, 
use Proposition 9.2.9 to express $(4,1)^2$ in terms of $(4,2)$, $(4,3)$, and $(4,17) = 4$. 
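
A numerical experiment can guide the choice of sign in part (b). The Python sketch below is an illustration under stated assumptions: it takes the period $(f,\lambda)$ to be the sum of $\zeta_{17}^{a}$ over $a$ in the coset $\lambda H_f$, where $H_f$ is the subgroup of order $f$ in $(\mathbb{Z}/17\mathbb{Z})^*$, and it uses the primitive root 3 modulo 17. The function name `period` is ours. Compare with the definition of periods in Section 9.2 before relying on it.

```python
import cmath
import math


def period(f, lam, p=17, g=3):
    """Numerical value of the f-period (f, lam) for the prime p.

    Assumes (f, lam) = sum of zeta_p^(lam * a) over a in H_f, where H_f is the
    subgroup of order f generated by g^e with e = (p - 1) / f.
    """
    e = (p - 1) // f
    h = pow(g, e, p)
    subgroup, x = [], 1
    for _ in range(f):
        subgroup.append(x)
        x = (x * h) % p
    zeta = cmath.exp(2j * math.pi / p)
    return sum(zeta ** ((lam * a) % p) for a in subgroup)


if __name__ == "__main__":
    print(period(8, 1))
    print((-1 + math.sqrt(17)) / 2, (-1 - math.sqrt(17)) / 2)
```

A computation of this kind suggests which sign to expect, but it does not replace the argument asked for in the exercise.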


Exercise 10. Hint: See Exercise 3. 


Exercise 11. (a) Hint: [g°] generates Hy, and [g*/%] generates Hyg. (b) Hint: First prove that 
[Ly (w) : Lyq(w))] = [Ly : Lyq] using the method of Exercise 5 of Section 4.3. Then look at 
the argument used in the proof of the General Case of Theorem 8.3.3. (c) Hint: First use 
Proposition 9.2.6 to study how o’ acts on f-periods. 


Exercise 12. (a) Hint: Use Exercise 16 of Section 9.1. 


Exercise 13. Hint: First show that $H_8 \subset (\mathbb{Z}/17\mathbb{Z})^*$ is the subgroup of squares. Then, for $p \nmid a$, 
explain why $x^2 \equiv a \bmod 17$ has a solution if and only if $[a] \in H_8$. 


Exercise 14. (a) Hint: Label the roots as a; = r8' fori= 1,...,n—1. 


Section 10.1 (pages 268-269) 


Exercise 4. (a) Hint: You can think of $\ell_1$ as the line through the points $(u_1,v_1)$ and $(u_2,v_2)$ in 
the plane $\mathbb{R}^2$. Consider the cases $u_1 = u_2$ and $u_1 \ne u_2$ separately. 


Exercise 5. (a) Hint: Do you remember how to construct an equilateral triangle? 


Exercise 6. Hint: Argue as in Example 10.1.9 that such a trisection implies that $\cos 20^\circ$ 
is constructible. Then use $\cos(\pi/3) = \tfrac{1}{2}$ and the identity $\cos(3\theta) = 4\cos^3\theta - 3\cos\theta$ from 
Section 1.3. 


Exercise 7. Hint: Combine the construction used in Proposition 7.1.7 with the fact that C is 
algebraically closed. 


Exercise 8. (a) Hint: Mimic the proof of Theorem 10.1.6. 
Exercise 10. (d) Hint: Use part (c) and x = a/3. 
Exercise 12. (b) Hint: r/3. 


Section 10.2 (page 273) 

Exercise 1. Hint: Note that if $m$ is odd, then $x^m + 1 = x^m - (-1)^m$. 

Exercise 5. (a) Hint: Use Ge= €,s-1 and Theorem 10.1.6. (b) Hint: Use Exercise 16 of 
Section 9.1. 

Section 10.3 (pages 284-286) 

Exercise 1. (a) Hint: $Q_1$ and $Q_2$ are the reflections of $P_1$ and $P_2$ about $\ell$. 

Exercise 3. (c) Hint: How does part (b) relate to (10.8)? 

Exercise 4. Hint: Use implicit differentiation. 

Exercise 5. (b) Hint: What are the roots of x? + 2x + 1? (c) Hint: Use (10.11). 


Exercise 6. Hint: The distance between $\alpha_1$ and $\alpha_2$ equals the distance between the reflections 
of these points about $\ell$. 


Exercise 13. Hint: Let $M$ be the midpoint of $QR$. Use the circle with center $M$ and radius 1/2 
to show that $\angle QMP = 2\angle QRP$. 


Exercise 14. (a) Hint: A perpendicular from R to the x-axis will meet the x-axis at a point S. 
This gives AROS. Then let T bisect the segment RP, and prove that AROT is congruent to 
AROS. 


Exercise 15. (b) Hint: This is very challenging. A solution can be found on page 128 of [15] 
in the references to Chapter 10. 

Section 11.1 (pages 300-301) 

Exercise 2. Hint: See Section A.1. 

Exercise 3. Hint: Show that $x^{p^m} - x$ is a factor of $x^{p^n} - x$ whenever $m$ divides $n$. 
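
Since $x^{p^m} - x = x(x^{p^m - 1} - 1)$, and $x^a - 1$ divides $x^b - 1$ whenever $a$ divides $b$, the hint comes down to a divisibility statement about the exponents $p^m - 1$ and $p^n - 1$. The following Python sketch, with function names of our own choosing, tests that statement for small values; it is an experiment, not a proof.

```python
def exponent_divides(p, m, n):
    """Return True when p**m - 1 divides p**n - 1."""
    return (p**n - 1) % (p**m - 1) == 0


def check_pattern(p, bound):
    """Compare the divisibility above with 'm divides n' for 1 <= m, n <= bound."""
    return all(
        exponent_divides(p, m, n) == (n % m == 0)
        for m in range(1, bound + 1)
        for n in range(1, bound + 1)
    )


if __name__ == "__main__":
    for p in (2, 3, 5, 7):
        print(p, check_pattern(p, 10))
```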

Exercise 9. (a) Hint: Use Exercise 4 of Section 9.1. (b) Hint: (p, f) = pZ[a] + fZ[al. 


Exercise 11. Hint: If $\alpha$ is a root of $f$ in some splitting field, then $[\mathbb{F}_p(\alpha) : \mathbb{F}_p] = n$. 


Section 11.2 (pages 308-310) 


Exercise 4. (b) Hint: Divide by the smallest power of p appearing in the formula and then 
work modulo p. 


Exercise 5. Hint: If $\alpha^m = 1$, then write $m = p^s d$ where $s \ge 0$ and $\gcd(d, p) = 1$. 
Exercise 10. (b) Hint: Explain why $f_i$ and $f_j$ are relatively prime for $i \ne j$. 
Exercise 11. (b) Hint: Show that each $R_i$ is a field. 

Exercise 12. (c) Hint: What is the factorization of x? — x? 


Exercise 17. Hint: Use Theorem 11.2.7. 


Section 12.1 (pages 331-334) 


Exercise 1. Hint: Look at the proof of (7.1). 
Exercise 2. (b) Hint: Another root of the Ferrari resolvent is $y_2 = (2\,3) \cdot y_1 = x_1x_3 + x_2x_4$. How 
are $H(y_1)$ and $H(y_2)$ related? 


Exercise 5. (a) Hint: For the choice of sign with $x_i$, $x_j$ as roots, the constant term is $x_ix_j$. But 
what is the constant term of the corresponding equation in (12.11)? Then do the same for the 
other equation. 


Exercise 9. (b) Hint: First show that $\sigma \cdot \varphi = \varphi$ for $\sigma \in H$, using the fact that multiplication by 
$\sigma$ permutes the elements of $H$. Then show that $\sigma \cdot \varphi \ne \varphi$ for all $\sigma \notin H$, using the fact that the 
exponents are distinct. Remember that two polynomials are different if one has a term which 
doesn’t appear in the other. 


Exercise 11. Hint: If you have done Exercise 19, then use part (f) of that exercise. Otherwise, 
follow the proof of Theorem 12.1.10, and pick $\varphi \in L$ with $H$ as isotropy subgroup. Then let 

$$\varphi_1 = \varphi, \varphi_2, \dots, \varphi_s$$
be the distinct rational functions $\sigma \cdot \varphi$ for $\sigma \in A_n$. Show that $s = [A_n : H]$ and that 
$$G = \{\sigma \in A_n \mid \sigma \cdot \varphi_i = \varphi_i \text{ for all } i = 1,\dots,s\}$$
is a normal subgroup of $A_n$. Then use Theorem 8.4.3 and the argument of Theorem 12.1.10. 
Exercise 12. (b) Hint: Use the Galois correspondence. 
Exercise 13. Hint: First explain why a - a” = a” implies that o -a = ¢/a for some j. 
Exercise 14. (b) Hint: Look at the proof of Theorem 12.1.4. 
Exercise 20. Hint: Use part (f) of the previous exercise. 


Exercise 21. (a) Hint: What obvious subgroup of $S_n$ has $(n-1)!$ elements? (b) Hint: Exercise 3. 


Section 12.2 (pages 345-347) 


Exercise 1. Hint: If $W_1 \subset W_2 \cup \cdots \cup W_m$, then intersect both sides with $W_1$ and use the inductive 
assumption. 


Exercise 3. Hint: In the second proof of Theorem 12.1.6, replace y; and w;, i= 1,...,5, 
with V. and aj), @ € Sn. Thus the polynomial &(x) from (12.5) will be a sum of n! terms. 
Show that W(x) € F[x] by arguments similar to those used in the proof of Proposition 5.2.1 in 
Section 5.2. Be sure that your argument explains where the separability of s(y) is used. 


Exercise 4. Hint: Let $g$ be the minimal polynomial of $\beta$, and let $M$ be a splitting field of $fg$ 
over F. 


Exercise 6. (d) Hint: Use (8.3). 
Exercise 9. Hint: Use Exercise 14 and $F \subset K \cap L \subset K$. 
Exercise 12. (b) Hint: Use what you did in Exercise 10. 


Exercise 15. (a) Hint: See the proof of Theorem 7.2.7. 
Section 12.3 (pages 354-355) 
Exercise 1. Hint: We are in characteristic 0. 


Exercise 5. (a) Hint: A nonzero polynomial in one variable of degree $N$ has at most $N$ 
roots in a field. Also, when $n > 1$, note that $g = \sum_{i=0}^{N} g_i(x_1,\dots,x_{n-1})\, x_n^i$, where at least one 
$g_i \in F[x_1,\dots,x_{n-1}]$ is nonzero. 


Exercise 6. (b) Hint: Use (12.7). (c) Hint: Use Exercise 4 of Section 9.1. 


Section 13.1 (pages 366-368) 
Exercise 1. (a) Hint: Take o € Gal(L/F) and let r = $1(). Also, explain why aj = 8,-1(). 


Exercise 5. Hint: If g doesn’t split completely over F, then show that the roots of g in some 
splitting field are b, u-t /v where b,u,v € F and v ¢ F?. 


Exercise 6. (a) Hint: Consider $(x - \alpha_1)(x - \alpha_2)$ and $(x - \alpha_3)(x - \alpha_4)$. 


Exercise 15. Hint: First show that |G| = [L: F] is a power of 2. 


Section 13.2 (pages 383-385) 


Exercise 3. (b) Hint: Let $\tau = (i_1\, \dots\, i_5)$. Show that $\sigma(i_1) = i_j$ implies that $\sigma(i_2) = i_{j+1}$ and 
so on. Then show that $\sigma = \tau^{j-1}$. (d) Hint: How many 5-cycles are there in $S_5$? (e) Hint: 
Remember that 5 divides $|G|$. 


Exercise 8. (a) Hint: Explain why $\tau \cdot u_i = \pm u_j$ for some $j$, and prove that some even 
permutation takes $u_i$ to $u_j$. Then use Exercise 7 to determine the sign. 


Exercise 9. (a) Hint: See the proof of part (a) of Theorem 13.1.1. (€) Hint: Comparing the 
coefficients of y°, y*, and y? gives equations which can be solved using Maple or Mathematica 
to express bo, b4,be in terms of a,b. Then compare the coefficients of y* and substitute the 
formulas for b2,b4,b¢ to get an equation involving only a and b. Now factor. A different 
argument is needed in characteristic 3, 


Exercise 10. (c) Hint: Use the method described in the hint to part (e) of Exercise 9. Here, 
you will need a different argument in characteristic 5. 


Exercise 16. (¢) Hint: Use Section 7.4. 


Section 13.3 (pages 397-399) 


Exercise 1. (a) Hint: First multiply $f$ by a suitable integer so that $a_0,\dots,a_n \in \mathbb{Z}$. Then multiply 
by $a_0^{n-1}$. 
Exercise 2. Hint: Use Exercise 6 of Section 9.1. 
Exercise 5. (b) Hint: Use Galois theory. 


Exercise 8. (b) Hint: Let y2 = (234) - vy and v3 = (34)-w2. Show that Gy-y = {+} and 
Gr- 2 = {4¢2,+93}. (c) Hint: Study the action of Gy on y, ¢2, and v3, where y2 and 3 
are defined in the hint to part (b). (d) Hint: Use the hints to parts (b) and (c). 


Exercise 9. Hint: By linear algebra, the map sending a matrix to the corresponding linear map 
is one-to-one. 
Exercise 10. (a) Hint: If $F$ is a field, then every subspace of $F^n$ of dimension $n - 1$ is defined 
by an equation $a_1x_1 + \cdots + a_nx_n = 0$ where $a_1,\dots,a_n \in F$ are not all zero. Furthermore, this 
equation is unique up to multiplication by a nonzero element of $F$. (c) Hint: If $V \subset F^3$ has 
dimension 2, then it has a basis which can be completed to a basis of $F^3$. Now use part (b). 


Exercise 11. Hint: For a field $F$, let $F^*I_n$ be the subgroup of $\mathrm{GL}(n,F)$ consisting of all 
nonzero multiples of the identity matrix. Then $\mathrm{PGL}(n,F) = \mathrm{GL}(n,F)/F^*I_n$, and $\mathrm{PSL}(n,F) = 
\mathrm{SL}(n,F)/(F^*I_n \cap \mathrm{SL}(n,F))$. 


Exercise 13. (6) Hint: Compute isotropy subgroups. 


Section 13.4 (pages 409-410) 


Exercise 4. (a) Hint: Use Section 2.3 to express the universal version of s, in terms of 
01,02,03,y. Then specialize to the coefficients of f. You will be surprised at the size of the 
polynomials involved. 


Exercise 6. (a) Hint: If $f = gh$ in $\mathbb{Q}[x_1,\dots,x_n]$, then pick positive integers $r,s$ as small as 
possible such that $rg, sh \in \mathbb{Z}[x_1,\dots,x_n]$. Now apply unique factorization to $rsf = (rg)(sh)$, 
and remember that a prime $p \in \mathbb{Z}$ is irreducible (in the UFD sense) in $\mathbb{Z}[x_1,\dots,x_n]$. (b) Hint: 
Take an irreducible factorization of $s_u$ in $\mathbb{Z}[u_1,\dots,u_n,y]$, and apply part (a) together with the 
fact that $\mathbb{Q}[u_1,\dots,u_n,y]$ is a UFD. 


Exercise 10. (b) Hint: Look at subgroups of S4 with four elements. 


Section 14.1 (pages 418-419) 
Exercise 2. (b) Hint: mo(g) +n[G: H] = 1. 


Exercise 4. Hint: Remember that $g \cdot hH = (gh)H$ gives an action of $G$ on the set of left cosets 
of $H$ in $G$. 


Exercise 8. (a) Hint: By adjoining roots one at time, prove that the splitting field satisfies 
{L: F] < 40. (b) Hint: Regard the roots as the nonzero vectors in F3 and pick roots a, 6,y such 
that y = a+. 

Section 14.2 (pages 427-429) 

Exercise 2. (a) Hint: Show that $32.52 contains a transposition. 

Exercise 3. (d) Hint: Use part (b) of Exercise 7 of Section 14.3. 


Exercise 8. (b) Hint: For $\sigma \in \mathrm{Gal}(L/F)$, we have $\tau \in S_m$ such that $\sigma(R_i) = R_{\tau(i)}$. What is 
the kernel of the map sending o to 7? The Galois correspondence from Chapter 7 will also be 
useful. 


Exercise 9. (a) Hint: Use Proposition 9.2.8 with f = | and f’ = f. 
Exercise 10. Hint: What power of $p$ divides $(p^n)!$? 
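
To form a conjecture for the question in this hint, one can compute the exponent for small cases. The Python sketch below uses Legendre's formula, which says that the exponent of a prime $p$ in $N!$ is $\lfloor N/p \rfloor + \lfloor N/p^2 \rfloor + \cdots$; the function name `legendre_exponent` is ours.

```python
def legendre_exponent(N, p):
    """Exponent of the prime p in N!, computed by Legendre's formula."""
    total, q = 0, p
    while q <= N:
        total += N // q   # multiples of q among 1, ..., N
        q *= p
    return total


if __name__ == "__main__":
    for p in (2, 3, 5):
        print(p, [legendre_exponent(p**n, p) for n in range(1, 5)])
```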
Exercise 11. Hint: Use Exercise 4. 


Exercise 15. Hint: Show that $|G_i|\,n = |G|$, where $G_i$ is the isotropy subgroup of $i \in \{1,\dots,n\}$. 
Given a subgroup $G_i \subset H \subset G$, consider the subsets $(\tau H) \cdot i$, $\tau \in G$. Show that these partition 
$\{1,\dots,n\}$ into blocks that are stable under the action of $G$. 


Exercise 16. (b) Hint: Use the congruence classes modulo $p$ in $\mathbb{Z}/p\mathbb{Z}$. 
Section 14.3 (pages 441-443) 


Exercise 2. (c) Hint: Consider the map Yat? A. (d) Hint: Semidirect products are discussed 
in the Mathematical Notes to Section 6.4, and (6.10) does the special case when n = 1 and 
q=p. 


Exercise 3. (a) Hint: Consider the map which sends Yaoy Oo. 
Exercise 4. (c) Hint: Use linear algebra. 


Exercise 5. (b) Hint: You need to study what happens when $N \subset A \times \{e_B\}$ or $N \subset \{e_A\} \times B$. 
(c) Hint: Conjugate (a~',b~') € N by (a1,b7') CAXB. 


Exercise 6. (a) Hint: Explain why conjugation by g gives an automorphism of N. 
Exercise 7. (b) Hint: Consider Akh7'k7'. 
Exercise 12. (b) Hint: Use part (c) of Exercise 2. 


Exercise 13. (b) Hint: Given two $(n-2)$-tuples of elements of $\{1,\dots,n\}$ consisting of distinct 
points, show that there are exactly two elements $\tau, \tau' \in S_n$ which map one $(n-2)$-tuple to the 
other. Then show that $\tau$ and $\tau'$ differ by a transposition. 


Exercise 14. (a) Hint: A $2 \times 2$ matrix has determinant 0 if and only if either its first column is 
zero or its first column is nonzero and the second column is a multiple of the first. (e) Hint: 
Apply the Fundamental Theorem of Group Actions to $\mathrm{GL}(3,\mathbb{F}_2)$ acting on $\mathbb{F}_2^3 \setminus \{0\}$. 


Exercise 15. Hint: First show that $\mathbb{F}_p^2$ has $p + 1$ lines through the origin, and explain why 
$\mathrm{PSL}(2,\mathbb{F}_p)$ permutes these lines. Then study the cases $p = 2$ and $p = 3$ in detail. 
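
The count of lines in the first part of this hint can be confirmed by enumeration for small primes. The Python sketch below is an illustration only; the function name `lines_through_origin` is ours, and each line is represented as the set of all scalar multiples of a nonzero vector.

```python
def lines_through_origin(p):
    """All one-dimensional subspaces of F_p^2, each stored as a frozenset of points."""
    lines = set()
    for x in range(p):
        for y in range(p):
            if (x, y) != (0, 0):
                # the line spanned by (x, y): all multiples c * (x, y), c in F_p
                line = frozenset(((c * x) % p, (c * y) % p) for c in range(p))
                lines.add(line)
    return lines


if __name__ == "__main__":
    for p in (2, 3, 5, 7):
        print(p, len(lines_through_origin(p)))
```

Each line contains $p - 1$ nonzero vectors and there are $p^2 - 1$ nonzero vectors in all, which is the counting argument behind the enumeration.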


Exercise 16. Hint: In the proof of Proposition 14.3.10, replace A and its conjugates with the 
minimal normal subgroups of G. 


Exercise 17. Hint: Use induction and the action of GL(n, F,) on F; \ {0}. 


Section 14.4 (pages 459-460) 
Exercise 1. Hint: Use Exercise 2 of Section 14.3. 
Exercise 3. Hint: Use Proposition 14.3.4. 


Exercise 9. (a) Hint: Use Lemma 14.4.3. (b) Hint: To analyze the kernel, write m € Ker(¢) 
as a linear combination of / and g. For the image, use the Fundamental Theorem of Group 
Homomorphisms. (d) Hint: For g = (? ~4), show that C(g) = {(5 ~) [r,s EF), r’° +s” £0}. 
Then use the Fundamental Theorem of Group Homomorphisms to determine the size of the 
kernel. 


Exercise 12. (c) Hint: The multiplicative group of a finite field is cyclic. (d) Hint: The only 
hard case is when p = 7. Show that M2 C M3; implies that M2 maps to Aq C S4 when you map 
to PGL(2,F7). Then use part (a). (e) Show that the image of (M2)o in PGL(2, Fs) has order 8 
and contains the subgroup H of Lemma 14.4.4. Recall that subgroups of index 2 are normal. 


Exercise 13. Hint: Consider the map F x Go > GL(2, F,) defined by (A,g) > Ag. 
Section 15.2 (pages 481-482) 


Exercise 9. (a) Hint: Gauss’s Lemma implies that P(u),Q(u) are relatively prime in Q[y}. 
Thus R(u)P(u) + S(u)Q(u) = 1 for some R(u),S(u) € Q[u]. Now replace u with u’. 


Exercise 11. Hint: Let $\alpha = \sin x$ and $\beta = \sin y$. 


Section 15.3 (page 489) 

Exercise 2. (b) Hint: Use Proposition 15.2.1 and part (d) of Exercise 3 of Section 15.2. 
Exercise 6. Hint: Use the identity (15.13) with $x = \tfrac{1}{2}(z + z_0)$ and $y = \tfrac{1}{2}(z - z_0)$. See also 
[Abel, Vol. I, pp. 277-278]. 

Section 15.4 (page 503-504) 


Exercise 3, Hint: See the proof of (a) < (b) of Proposition 3.1.1. Also recall if a € Z[i] is 
prime, then a = By, 8, € Zii], implies that 8 or ¥ is a unit, i-e., lies in Z[i]*. 


Exercise 9. Hint: See the hint for Exercise 9 of Section 15.2. 


Exercise 12. Hint: Follow the proof of Theorem 4.2.3. You will also need Gauss’s Lemma 
over $\mathbb{Z}[i]$, which holds by Theorem A.5.8 because $\mathbb{Z}[i]$ is a UFD. 


Exercise 13. Hint: Use the geometric series $1/(1+x) = \sum_{k=0}^{\infty} (-1)^k x^k$ to write 

$$\frac{1}{1 + a_1u + \cdots + a_du^d} = \sum_{k=0}^{\infty} (-1)^k (a_1u + \cdots + a_du^d)^k.$$

The series on the right-hand side makes sense because 
$$(a_1u + \cdots + a_du^d)^k = u^k(a_1 + \cdots + a_du^{d-1})^k.$$
So a given power of $u$ appears in only finitely many terms. 
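
The finiteness observation in this hint is what makes the series computable: modulo $u^n$, only the terms with $k < n$ contribute. The Python sketch below is a minimal illustration with hypothetical function names (`mul_trunc`, `inverse_via_geometric_series`); it sums those terms with exact rational arithmetic.

```python
from fractions import Fraction


def mul_trunc(a, b, n):
    """Product of two coefficient lists (constant term first), truncated modulo u^n."""
    out = [Fraction(0)] * n
    for i, x in enumerate(a[:n]):
        if x == 0:
            continue
        for j, y in enumerate(b[:n - i]):
            out[i + j] += x * y
    return out


def inverse_via_geometric_series(a, n):
    """Coefficients of 1 / (1 + a_1 u + ... + a_d u^d) modulo u^n, where a = [a_1, ..., a_d]."""
    base = [Fraction(0)] + [Fraction(x) for x in a]    # a_1 u + ... + a_d u^d
    result = [Fraction(0)] * n
    power = [Fraction(1)] + [Fraction(0)] * (n - 1)    # base^0
    for k in range(n):                                 # base^k is divisible by u^k
        sign = -1 if k % 2 else 1
        for i in range(n):
            result[i] += sign * power[i]
        power = mul_trunc(power, base, n)
    return result


if __name__ == "__main__":
    print(inverse_via_geometric_series([1, 1], 6))
```

Multiplying the result by $1 + a_1u + \cdots + a_du^d$ with `mul_trunc` should give 1 modulo $u^n$, which is a convenient self-check.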
Exercise 14. (d) Hint: For $c_0 = 1$, use $\varphi'(0) = 1$, and for $c_1 = -\tfrac{1}{10}$, use $\varphi'^2(z) = 1 - \varphi^4(z)$ 
with $\varphi(z) = z + \cdots$ and $\varphi'(z) = 1 + 5c_1z^4 + \cdots$. What is the coefficient of $z^4$? 
Section 15.5 (page 512) 


Exercise 1. Hint: Use the fact that $\mathbb{Z}[i]$ is a PID to prove that if $\alpha, \beta$ are relatively prime, then 
$\gamma\alpha + \delta\beta = 1$ for some $\gamma, \delta \in \mathbb{Z}[i]$. 


Exercise 5. Hint: Use part (b) of Lemma 15.4.2. 


Exercise 6. Hint: Show that the obvious map $\mathbb{Z}[i]/\alpha\beta\mathbb{Z}[i] \to \mathbb{Z}[i]/\alpha\mathbb{Z}[i] \times \mathbb{Z}[i]/\beta\mathbb{Z}[i]$ is 
one-to-one. Then use part (a) of Lemma 15.4.2. 


APPENDIX C 
STUDENT PROJECTS 


The material in the latter part of the book lends itself well to independent projects. In 
this appendix, we suggest some topics that students might find interesting. Most of 
the projects listed here are reasonably short, though a few are more ambitious. Many 
are based on optional sections of the text. Here is the list: 


• Abelian Equations. These equations and their relation to Abelian groups are 
discussed in Sections 6.5 and 8.5. The goal would be to explain why commutative 
groups are called “Abelian.” A more ambitious version of this project would 
involve looking at the Historical Notes to Section 15.5. 


• Automorphisms and Geometry. There are several projects involving Section 7.5: 
◦ Theorem 7.5.3 gives a classic argument from Galois theory and leads to 
some nice geometric examples of Galois groups. 
◦ Another project would be to study linear fractional transformations and 
stereographic projection. 
◦ More ambitious projects would be to classify finite subgroups of PGL(2,C), 
explore some invariant theory, or give a proof of Lüroth’s theorem. See the 
Mathematical Notes to Section 7.5. 

• The Casus Irreducibilis. As explained in Section 1.3, Cardan’s formulas for a 
cubic with real roots involve complex numbers. The goal of this project would 
be to prove that complex numbers are unavoidable, as explained in Section 8.5. 


• Gauss and Roots of Unity. Here are two projects based on Section 9.2:
◦ A project could use Gauss’s theory of periods to work out the Galois correspondence
for a cyclotomic extension $\mathbb{Q} \subset \mathbb{Q}(\zeta_p)$, $p$ prime.
◦ Another project would be to focus on the properties of periods and derive
Gauss’s amazing formula for $\cos(2\pi/17)$ given in (9.19).

• Squarable Lunes. Read references [18] and [20] from Section 10.1 to learn more
about the squarable lunes mentioned in the Historical Notes to the section.


• Regular $n$-gons. Theorem 10.2.1 characterized the $n$’s for which a regular $n$-gon
can be constructed by straightedge and compass. The proof used the irreducibility
of the cyclotomic polynomial $\Phi_n(x)$, which is not easy to prove. A nice project,
based on Exercises 2-6 of Section 10.2, is to give a more elementary proof of
Theorem 10.2.1 that uses the Schönemann-Eisenstein criterion.


• Origami. Section 10.3 leads to several possible projects:
◦ Use origami constructions to trisect angles and duplicate the cube.
◦ There is also the Galois theory of origami, presented in Theorem 10.3.6.
◦ A student could also focus on Exercise 18 of Section 10.3, which characterizes
the $n$’s for which a regular $n$-gon can be constructed by origami.
The references for Section 10.3 can be used for other projects involving origami.


• Polynomials over Finite Fields. There are two projects based on Section 11.2:
◦ The first would be to prove the formula for the number of monic irreducible
polynomials in $\mathbb{F}_p[x]$ given by Theorem 11.2.4. See (11.10) and Exercise 8.
◦ The second would be to study cyclotomic polynomials modulo a prime $p$
and, following Exercise 17, give a proof of the irreducibility of $\Phi_p(x)$ that
does not use the Schönemann-Eisenstein criterion.
• Lagrange. Section 12.1 lends itself to several projects:
◦ It is fun to work out the solutions of the quartic given in (12.11) (Ferrari) and
(12.17) (Euler). Exercise 18 is relevant.
◦ One project would be to explain how Lagrange’s formula for the degree of a
resolvent (Theorem 12.1.4) relates to Lagrange’s Theorem in group theory.
◦ A student could explore Lagrange’s version of the Galois correspondence for
the universal extension. See Theorems 12.1.6 and 12.1.9 and Exercise 9.
◦ A nice project would be to explain how the affine linear group $\mathrm{AGL}(1,\mathbb{F}_p)$ is
implicit in Lagrange’s work. This involves a study of the Lagrange resolvents
(12.19). See also Exercise 15.
◦ Finally, a student could explain how Theorem 12.1.10 messes up Lagrange’s
inductive strategy for finding the roots of a polynomial.
• Galois. Here are some projects based on Section 12.2:
◦ A student could summarize Lagrange’s approach from Section 12.1 and
explain how Galois went beyond Lagrange.
◦ A good project is to explain how Galois thought about his group and how it
relates to the modern notion of Galois group. See Theorem 12.2.3.
◦ Another project is to explain Galois’s strategy for finding radical expressions
for roots. See Example 12.2.6 and Exercises 7 and 8.


• Kronecker. We proved the existence of splitting fields in Theorem 3.1.4, and in
the Historical Notes to Section 3.1, we gave credit to Kronecker. A project based 
on Section 12.3 would give Kronecker’s construction of splitting fields. This 
requires Exercises 4-8 plus the Galois resolvents defined in Section 12.2. 


• Quartic Polynomials. A student could report on the Galois group of quartics,
following Section 13.1. A more ambitious version of the project would include 
material from Section 13.3 on quartics in all characteristics. 


• Quintic Polynomials. Similarly, a student could report on the Galois group of
quintics, following Section 13.2. A more substantial project would be to study 
the roots of quintics that are solvable by radicals, using the references mentioned 
in the Mathematical Notes to Section 13.2. 


• Computing Galois Groups. Here are more projects based on Chapter 13:
◦ A student could study how resolvents relate to Cardan’s or Ferrari’s formulas
for the roots of a cubic or quartic and also to the Galois group of a quartic or
quintic. The centerpiece of this project would be Section 13.3.
◦ One project would be to work out the details of why $\mathrm{GL}(3,\mathbb{F}_2)$ is the Galois
group of $x^7 - 154x + 99$ over $\mathbb{Q}$. This involves Proposition 13.3.9 and
Example 13.3.10.
◦ A computer project would be to determine various Galois groups over $\mathbb{Q}$ by
factoring modulo $p$ for various primes $p$, as explained in Section 13.4.
• Polynomials of Prime Degree. Section 14.1 is a lovely continuation of the ideas
of Chapter 8. Theorem 14.1.1 on solvable polynomials of prime degree is one of 
the great theorems of Galois. This is a very accessible project. 
• Solvable Permutation Groups. Chapter 14 has lots of material for projects:
◦ A student could explore how wreath products first arose in the context of
Galois theory, following the Historical Notes to Section 14.2.
◦ A student could discover why Galois invented finite fields, explained in the
Historical Notes to Section 14.4.
◦ A student could explore the structure of primitive solvable permutation
groups, as described in Section 14.3. The math is surprisingly deep.
◦ A nice project would be to classify imprimitive solvable permutation groups
of degree $p^2$, $p$ prime, based on Section 14.2. This is surprisingly easy.
◦ A harder project would be to classify primitive solvable permutation groups
of degree $p^2$, $p$ prime, based on Section 14.4.


• The Lemniscate. Abel’s theorem about geometric constructions on the lemniscate
involves a rich combination of Galois theory, complex analysis, and number theory.
A report on Abel’s theorem in Section 15.5 could be the basis for a nice project.
A more ambitious project would be to digest all of Chapter 15.
computing Galois groups


of an extension, 125-126, 130 


See also extension, cyclotomic and extension, 
246, 344, 371, 377, 385, 425, 428, 506,
510-512
imprimitive of degree $p^2$, 425, 457, 553
of degree $p^2$, 345
of degree $p$, 345, 370, 413, 416-418, 553
according to Galois, 121, 336-337
according to Gauss, 239, 241 
according to Lagrange, 322 
Theorem of, 119, 121-122, 130, 162, 198, 
219-220, 263, 332, 348, 354, 542 
primitive root 
modulo p, 242, 248 
of a finite field, 298, 310 
of unity, see root of unity, primitive 
principal ideal domain, see ring, principal ideal 
domain 
product, see Cartesian product and group, product 
of and semidirect product 
projective duality, 397 
projective linear group, xxvii, 396 
over a finite field, 438 
over C, 181-183, 449, 461 
three-dimensional, 398 
two-dimensional, xxvi, 179, 185-186, 443-444, 
446-447, 450, 455, 460-461, 549, 551 


matrix algebra, 93 
Matzat, B. H., 188
Maistrova, A. L., 411
Mazur, B., xix, 23
McKay, J., xix, 53, 410-411
McMullen, C., 368, 410 
Meeks, K. I., 288 
Menaechmus, 267, 283 
meromorphic function, 485, 487 
minimal polynomial, see polynomial, minimal 
mira, 284 
mixed equation, 250 
Möbius
function, xxvii, 237, 302 
inversion formula, 303, 310 

module system, 349, 351 

See also ideal 
modulus, see elliptic, integral, modulus of 
Mohr-Mascheroni Theorem, 264 
Mollame, V., 226 
Mollin, R. A., 254 
monomial, 26 
Moore, E. H., 300 
Mora, T., 188 
Mordell Conjecture, 396 
Moreno, C. J., 297, 311 
Mortimer, B., 462 
multiply transitive, see permutation group,
multiply transitive

nth root, see root, nth
nth root of unity, see root of unity, nth 
Nakagawa, H., 411 
Nakamizo, T., 410 
natural irrationality, 339, 341 
Theorem on, 340-341, 344 
n-division points 
of the circle, 464 
of the lemniscate, see lemniscate, n-division 
points of 
Nelson, R. B., 288 
Nemorarius, J., 287 
Newton, I., 38, 46 
Newton identities, 38, 42, 538 
Newton’s method, 368 
Nicomedes, 283, 286 
See also conchoid of Nicomedes 
Niederreiter, H., 296-297, 311 
Niven, I., 99, 311 
normal 
closure, 152-153 
extension, see extension, normal 
form, 45 
subgroup, see subgroup, normal 
normalizer, xxvi, 159, 161, 167, 370, 414, 418,
447, 451-455, 460-461 


O


$\mathscr{O}$, see field, of origami numbers
o(g), see order, of a group element 
octahedron, 181-182, 184, 186-187, 427, 449
O’Nan-Scott Theorem, 438-439 
one-to-one correspondence, xxiv 
orbit, 318, 530-531 
order 

of a group, 515 

of a group element, xxiv, 516 
origami, 274, 282, 284, 552 

number, 276, 278-281 

See also field, of origami numbers 

O’Shea, D., 53, 70, 123 
Osofsky, B., 188 
ovals of Cassini, 469 


P 


$\mathscr{P}$, see field, of Pythagorean numbers
$\varpi$ (variant of $\pi$), xxvii, 466
$\wp$-function, xxviii, 487-488, 501
addition law for, 488, 501 
complex multiplication for, 501 
See also elliptic, function 
$\varphi$-function, see Euler $\varphi$-function
PGL(n, F), see projective linear group 
PSL(n, F), see projective special linear group 
Pambuccian, V., 289 
paperfolding, 277, 284 
See also origami 
Pappus, 282-283, 286 
parabola, 267, 280, 283 
directrix of, 275, 284-285 
focus of, 275, 284-285 
intersection of, see intersection of parabolas 
simultaneous tangents to, 274-275, 282, 285 
tangent line to, 275, 285 
Parry, W., xix, 459 
Parshin, A. N., 70 
Pascal, B., 287 
Pascal, E., 287 
See also limaçon
period, xxvi, 239-240, 249, 272, 552 
expressible by radicals, 246-247, 250 
Galois action on, 240 
generalized, 242 
minimal polynomial of, 241 
product of, 243 
relation to Gauss sums, 249 
period lattice, see Abel’s function, doubly 
periodic, period lattice 
permutation, 517 
according to Galois, 338, 343 


projective plane, 283, 397 
projective special linear group, xxvii, 396, 548 
over a finite field, 438 
simplicity of, 438, 441 
three-dimensional, 438 
two-dimensional, 438, 443 
finite subgroups of, 183, 458, 461, 551 
pseudo-random number generator, 310 
pure equation, 250 
Pythagorean 
field, see field, Pythagorean 
number, see field, of Pythagorean numbers 
Theorem, 265 
triple, 365, 367 


Q 
Q, see field, of rational numbers 
quadratic 
formula, 3, 5, 13, 51, 64, 217, 252, 261, 277,
475, 527, 544
in characteristic 2, 52 
reciprocity, 87, 249 
residue, 407 
quadratrix, 267-269 
quartic polynomial, 329-330, 334, 358, 361-362 
Euler’s solution of, 325-326, 330, 334, 360, 552 
Ferrari’s solution of, 9, 217, 323-324, 325, 552 
Galois group of, 358, 363, 391-392, 553 
origami solution of, 279 
quaternion group, see group, quaternion 
quaternions, 61 
quintic polynomial, 184, 330, 368, 553 
Bring-Jerrard form, 377, 379-380, 382, 385 
equivalent, 380 
Brioschi form, 185, 380 
Galois group of, 139, 371, 373, 376, 553 
icosahedral solution of, 185 
solvable by radicals, 371, 377, 382 
formula for roots, 379 
unsolvability of, 185, 200, 214, 217, 220, 330 
quotient 
group, xxiv, 158, 160, 179, 516 
ring, xxv, 56-57, 60, 300, 349, 519 


R 


R, see field, of real numbers 
radical, 197, 216 

extension, see extension, radical 

prime, 85 

real, 221 

radix, 216 
Radloff, I., 188, 356, 462 
rational function, 26, 316 

similar, 320, 323 

symmetric, 38, 40, 52, 169, 316 
rational integral algebraic function, 250 
See also polynomial
Rationalitäts-Bereich, 348
reduction modulo p, 404, 407 
bad, 408 
regular 
heptagon, 278, 283 
n-gon, 235, 256, 464 
constructible, 270, 273, 552 
via origami, 288 
resolvent polynomial, xxvii, 153, 317, 319, 323, 
330-331, 379, 387-388, 398, 418, 553 
approximate, 387 
cubic, 5, 11, 317 
factorization of, 393 
Ferrari, xxvii, 52, 324, 332, 334, 358-359, 
361-362, 364, 386, 388-389, 546 
Galois, xxvii, 335, 337, 345, 347, 351, 353, 400 
importance of simple roots, 365, 386, 388, 418 
Kronecker, xxvii, 401 
Lagrange, see Lagrange, resolvent 
quadratic, xxvii, 390-391 
relative, 365, 390, 398 
sextic, xxvii, 371-373, 378, 382, 386, 389 
universal, 372 
restriction of a function to a subset, xxiv 
resultant, xxvi, 115 
Richelot, F. J., 272
Riemann sphere, 180 
Rigatelli, L., 356 
ring, 349, 519 
division, 61 
homomorphism of, see homomorphism, ring 
integral domain, 26, 520 
irreducible element, 534 
isomorphism of, see isomorphism, ring 
of Gaussian integers, see Gaussian integers 
polynomial, xxv, 26, 93, 522, 534 
principal ideal domain, 29, 491, 523, 550 
quotient of, see quotient, ring 
unique factorization domain, 26, 59, 401, 409, 
491, 495, 520, 524, 534, 538, 540, 543, 
548, 550 
unit of, 491, 534 
Robbiano, L., 123 
Roberval, G., 288 


Romanus, A., 20 
Roney-Dougal, C. M., 462 
root, 216 


multiple, 15, 68, 109, 389-390, 496 
multiplicity of, 109, 523 
nth, 525-526 

See also root of unity 
of a polynomial, 522 

existence of, 59, 61, 351-352, 521 
simple, 109, 365 
root of unity, 262
cube, xxv, 6 
minimal polynomial of, see polynomial, 
cyclotomic 
in characteristic p, 224-226 
nth, xxiv, xxvi, 229, 257, 464, 526, 541 
primitive, 201, 204-206, 231, 236, 304, 543 
pth, 85, 136, 238 
Rose, J. S., 462 
Rosenberger, G., 70 
Rosen, M., 254, 311, 462, 513 
Row, T. S., 284, 289 
Rudakov, A. N., 70 
Ruffini, P., 214, 220, 330-331 
ruler, see marked ruler and straightedge 
Runge, C., 382 
Ruppert, W. M., 385, 411 


S


SL(n, F), see special linear group 
$S^1$, xxiv, 530
$S^2$, xxvi, 180
$S_n$, see symmetric group
Samuel, P., 411 
Schönemann, T., 87-88, 236, 298-299, 307, 353,
503
Schönemann-Eisenstein criterion, 84-85, 88, 92,
136, 139, 271, 273, 398, 552
according to Eisenstein, 502 
over the Gaussian integers, 498, 503-504, 508 
Schreier, O., 67, 119, 225 
Schur, I., 171 
semidirect product, 140, 142, 426, 428, 431, 441, 
519, 549 
semilinear group 
one-dimensional, 453-455, 461 
semilinear transformation, 453 
separable 
degree, see degree, separable 
extension, see extension, separable 
over a field, 111 
polynomial, see polynomial, separable 
Serre, J.-P., 146
Serret, J., 343 
sextic resolvent, see resolvent polynomial, sextic 
Shenitzer, A., xix 
Short, M. W., 462 
Shurman, J., xix, 188, 288, 411, 513 
Siegel, C. L., 513 
sign of a permutation, see permutation, sign of 
signed arc length, 471 
signed polar distance, 471 
Silverman, J. H., 411, 514 
similar functions, see rational function, similar 
simple group, see group, simple 
simple pole, 485 


simple zero, 484 
See also root, simple 
Singerman, D., 188, 513 
Slavutin, E. I., 70
Smirnova, G. S., 23, 70, 227, 356 
Smith, D. E., 70, 356 
socle of a group, 439, 443 
nonregular, 439 
regular, 439 
Soicher, L., 411 
Solovyev, Y., 513 
solvable 
by radicals, see polynomial, solvable by radicals 
extension, see extension, solvable 
group, see group, solvable 
Spearman, B. K., 410-411 
special linear group, xxvii, 396, 548 
over a finite field, 438 
two-dimensional, 443, 461 
spiral of Archimedes, 267, 269 
splits completely, 59, 107 
splitting field, see field, splitting 
squaring the circle, 262, 266-269 
Starr, N., xix 
starting configuration, 264 
Stauduhar, R. P., 387, 411 
Steinitz, E., 79, 122, 348, 354 
stereographic projection, 180, 551 
Stevenhagen, P., 289, 411 
Stewart, I., 556 
Stillwell, J., 411 
straightedge, 255 
and compass, 235, 245, 255, 464-465, 467, 
470-471, 474, 476, 478-479, 504, 506 
and dividers, 264 
See also marked ruler 
Stubhaug, A., 227 
subgroup, 515 
conjugate, 155, 358, 364, 366, 370, 373, 376, 
386, 388, 395, 398, 401, 403, 415, 422, 
424, 435, 451, 458, 531 
index of, 159, 319, 330, 415, 418, 516 
isotropy, 161, 318, 320, 437, 443, 530-531, 
546, 548 
normal, 155, 158, 160, 331, 344, 516 
minimal, 432-433, 435, 442, 455, 461, 549 
of a cyclic group, 517
Sylow, 195, 219, 370, 383, 415-416, 418, 
427-428, 448, 460, 532 
transitive, see symmetric group, transitive 
subgroup of 
substitution, 343 
See also permutation 
Suprunenko, D. A., 462 
Swallow, J., 556 


Sylow Theorems, 195-196, 220, 370, 383, 
415-416, 418, 427, 460, 533 
symmetric group, xxiv, 11, 30, 135, 138-140, 183, 
186, 316, 401, 408, 517 
index of subgroups of, 327, 330-331, 334 
normal subgroups of, 213 
solvable subgroups of, see permutation group, 
solvable 
subgroup of, see permutation group 
transitive subgroup of, 134, 136, 360, 363-364, 
415, 419, 421, 429-430, 435, 441 
classification for $S_4$, 363
classification for $S_5$, 368-370, 406
classification for $S_7$, 394-395
classification for $S_n$, $n < 32$, 386
classification for $S_p$, 458
symmetric polynomial, see polynomial, symmetric 
symmetry group, see group, symmetry 


T 


2-3 tower, 277-278, 281 
Takagi, T., 512
Tartaglia, 9, 19, 329 
Tate, J., 411, 514 
term, 25 
leading, 32, 39-40 
tetrahedron, 184, 187 
Thabit ibn Qurra, 283 
Theorem of the Primitive Element, see primitive 
element, Theorem of 
Theorem on Natural Irrationalities, see natural 
irrationality, Theorem on 
Thompson, J., 195, 227 
Tignol, J.-P., 556 
total degree, xxv, 26 
Tower Theorem, 91 
trace, 459 
transcendental 
extension, see extension, purely transcendental 
over a field, 73, 79 
transitive 
group action, see action of a group, transitive 
subgroup of $S_n$, see symmetric group, transitive
subgroup of 
transposition, 518 
triangle inequality, 64 
trinomial, 396 
equivalent, 396 
of large degree, 310 
trisecting the angle, 262, 264, 266-269, 283 
via intersection of conics, 286 
via marked ruler, 286 
via marked ruler and compass, 282, 285 
via origami, 274, 284 
Tschirnhaus, E. W., 329, 380 
Tschirnhaus transformation, 379-380, 381, 385,
389 
twice-notched straightedge, 279 
See also marked ruler and straightedge 
UFD, see ring, unique factorization domain
Unger, W. R., 462 
unique factorization domain, see ring, unique 
factorization domain 
universal 
extension, xxvi, 138, 169, 217, 316, 552 
Galois group of, 138 
polynomial, xxv, 14, 37, 138, 169, 217, 219, 
316, 355 
unsolvability for $n \ge 5$, 217
unsolvability of quintic, see Abel, unsolvability of 
quintic and quintic polynomial, 
unsolvability of 
Vandermonde, C. A., 50-51, 235, 330-331, 382
Vandermonde determinant, 49 
van der Waerden, B. L., 356, 556 
van Roomen, A., see Romanus, A. 
Velleman, D., 70 
verging, 279, 282-283, 285, 287 
Videla, C. R., 289 
Viète, F., 9, 18, 20, 283, 329
See also cubic polynomial, trigonometric
solution of
Vlăduţ, S. G., 514
$\omega$, see root of unity, cube

$\varpi$ (variant of $\pi$), xxvii, 466
Weierstrass $\wp$-function, see $\wp$-function
$\zeta_n$, see root of unity, nth
$\zeta_p$, see root of unity, pth